perm filename DVITYP.WEB[WEB,ALS]3 blob
sn#672760 filedate 1982-08-12 generic text, type C, neo UTF8
% This program by D. E. Knuth is not copyrighted and can be used freely.
% But don't try to use it extensively, since it hasn't been very well
% debugged yet... This is an experimental version.
% Here is TeX material that gets inserted after \input webhdr
\def\hang{\hangindent 3em\ \unskip\!}
\chcode@@=13 \def@@{\penalty999\ } % ties words together
\def\TeX{T\hbox{\hskip-.1667em\lower.424ex\hbox{E}\hskip-.125em X}}
\font b=cmr9 \def\mc{\:b} % medium caps for names like PASCAL
\def\PASCAL{{\mc PASCAL}}
\def\(#1){} % this is used to make module names sort themselves better
\def\9#1{} % this is used for sort keys in the index
\font D=cmtt at 15truept % font used in the title line below (only)
\font E=cmr7 at 14truept % font used in the title line below (only)
\def\title{DVI$\,$\lowercase{type}}
\def\contentspagenumber{401}
\def\topofcontents{\topspace 0pt
\vfill
\ctrline{\:E The {\:D DVItype} processor}
\vskip 15pt
\ctrline{(Version 0.5, semi-debugged)}
\vfill}
\def\botofcontents{\vfill
\ctrline{\ragged0\spaceskip0pt\xspaceskip0pt\baselineskip9pt
\hbox par 5in{\:bThe preparation of this report
was supported in part by the National Science
Foundation under grants IST-8201926 and MCS-7723738;
by Office of Naval Research grant N00014-81-K-0330;
and by the System Development Foundation. `\TeX' is a
trademark of the American Mathematical Society.}}}
\setcount0 \contentspagenumber
\topofcontents
\ctrline{(replace this page by the contents page printed later)}
\botofcontents
\mark{1}\eject
@* Introduction.
The \.{DVItype} utility program reads binary device-independent (``\.{DVI}'')
files that are produced by document compilers such as \TeX, and converts them
into symbolic form. This program has two chief purposes: (1)@@It can be used to
determine whether a \.{DVI} file is valid or invalid, when diagnosing
compiler errors; and (2)@@it serves as an example of a program that reads
\.{DVI} files correctly, for system programmers who are developing
\.{DVI}-related software.
Goal number (2) needs perhaps a bit more explanation. Programs for
typesetting need to be especially careful about how they do arithmetic; if
rounding errors accumulate, margins won't be straight, vertical rules
won't line up, and so on. But if rounding is done everywhere, even in the
midst of words, there will be uneven spacing between the letters, and that
looks bad. Human eyes notice differences of a thousandth of an inch in the
positioning of lines that are close together; on low resolution devices,
where rounding produces effects four times as great as this, the problem
is especially critical. Experience has shown that unusual care is needed
even on high-resolution equipment; for example, a mistake in the sixth
significant hexadecimal place of a constant once led to a difficult-to-find
bug in some software for the Alphatype CRS, which has a resolution of 5333
pixels per inch (make that 5333.33333333 pixels per inch). The document
compilers that generate \.{DVI} files make certain assumptions about the
arithmetic that will be used by \.{DVI}-reading software, and if these
assumptions are violated the results will be of inferior quality.
Therefore the present program is intended as a guide to proper procedure
in the critical places where a bit of subtlety is involved.
The first \.{DVItype} program was designed by David Fuchs in 1979, and it
went through several versions on different computers as the format of
\.{DVI} files was evolving to its present form.
The |banner| string defined here should be changed whenever \.{DVItype}
gets modified.
@d banner=='This is DVItype, Version 0.5' {printed when the program starts}
@ This program is written in standard \PASCAL, except where it is necessary
to use extensions; for example, \.{DVItype} must read files whose names
are dynamically specified, and that would be impossible in pure \PASCAL.
All places where nonstandard constructions are used have been listed in
the index under ``system dependencies.''
@!@↑system dependencies@>
One of the extensions to standard \PASCAL\ that we shall deal with is the
ability to move to a random place in a binary file; another is to
determine the length of a binary file. Such extensions are not necessary
for reading \.{DVI} files, and they are not important for efficiency
reasons either---an infrequently used program like \.{DVItype} does not
have to be efficient. But they are included here because of \.{DVItype}'s
r\accent'17ole as a model of a \.{DVI} reading routine, since other \.{DVI}
processors ought to be highly efficient. If \.{DVItype} is being used with
\PASCAL s for which random file positioning is not efficiently available,
the following definition should be changed from |true| to |false|; in such
cases, \.{DVItype} will simply read through the \.{DVI} file twice,
instead of doing so slightly more than once, so the running time for input
will increase by roughly a factor of two.
Another extension is to use a default |case| as in \.{TANGLE}, \.{WEAVE},
etc.
@d random_reading==true {should we skip around in the file?}
@d othercases == others: {default for cases not listed explicitly}
@d endcases == @+end {follows the default case in an extended |case| statement}
@f othercases == else
@f endcases == end
@ The binary input comes from |dvi_file|, and the symbolic output is written
on \PASCAL's standard |output| file. The term |print| is used instead of
|write| when this program writes on |output|, so that all such output
could easily be redirected if desired.
@d print(#)==write(#)
@d print_ln(#)==write_ln(#)
@p program DVI_type(@!dvi_file,@!output);
label @<Labels in the outer block@>@/
const @<Constants in the outer block@>@/
type @<Types in the outer block@>@/
var@?@<Globals in the outer block@>@/
procedure initialize; {this procedure gets things started properly}
var i:integer; {loop index for initializations}
begin print_ln(banner);@/
@<Set initial values@>@/
end;
@ If the program has to stop prematurely, it goes to the
`|final_end|'.
@d final_end=9999 {label for the end of it all}
@<Labels...@>=final_end;
@ The following parameters can be changed at compile time to extend or
reduce \.{DVItype}'s capacity.
@<Constants...@>=
@!max_fonts=100; {maximum number of distinct fonts per \.{DVI} file}
@!max_widths=10000; {maximum number of different characters among all fonts}
@!line_length=80; {bracketed lines of output will be at most this long}
@!terminal_line_length=150; {maximum number of characters input in a single
line of input from the terminal}
@!stack_size=100; {\.{DVI} files shouldn't |push| beyond this depth}
@!name_size=1000; {total length of all font file names}
@!name_length=50; {a file name shouldn't be longer than this}
@ Here are some macros for common programming idioms.
@d incr(#) == #←#+1 {increase a variable by unity}
@d decr(#) == #←#-1 {decrease a variable by unity}
@d do_nothing == {empty statement}
@* The character set.
Like all programs written with the \.{WEB} system, \.{DVItype} can be
used with any character set. But it uses ascii code internally, because
the programming for portable input-output is easier when a fixed internal
code is used, and because \.{DVI} files use ascii code for file names
and certain other strings.
The next few modules of \.{DVItype} have therefore been copied from the
analogous ones in the \.{WEB} system routines. They have been considerably
simplified, since \.{DVItype} need not deal with the controversial
ascii codes less than @'40.
@<Types...@>=
@!ascii_code=" ".."~"; {a subrange of the integers}
@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
character sets were common, so it did not make provision for lower case
letters. Nowadays, of course, we need to deal with both upper and lower case
alphabets in a convenient way, especially in a program like \.{DVItype}.
So we shall assume that the \PASCAL\ system being used for \.{DVItype}
has a character set containing at least the standard visible characters
of ascii code (|"!"| through |"~"|).
Some \PASCAL\ compilers use the original name |char| for the data type
associated with the characters in text files, while other \PASCAL s
consider |char| to be a 64-element subrange of a larger data type that has
some other name. In order to accommodate this difference, we shall use
the name |text_char| to stand for the data type of the characters in the
output file. We shall also assume that |text_char| consists of
the elements |chr(first_text_char)| through |chr(last_text_char)|,
inclusive. The following definitions should be adjusted if necessary.
@↑system dependencies@>
@d text_char == char {the data type of characters in text files}
@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
@d last_text_char=127 {ordinal number of the largest element of |text_char|}
@<Types...@>=
@!text_file=packed file of text_char;
@ The \.{DVItype} processor converts between ascii code and
the user's external character set by means of arrays |xord| and |xchr|
that are analogous to \PASCAL's |ord| and |chr| functions.
@<Globals...@>=
@!xord: array [text_char] of ascii_code;
{specifies conversion of input characters}
@!xchr: array [ascii_code] of text_char;
{specifies conversion of output characters}
@ Under our assumption that the visible characters of standard ascii are
all present, the following assignment statements initialize the
|xchr| array properly, without needing any system-dependent changes.
@<Set init...@>=
xchr[@'40]←' ';
xchr[@'41]←'!';
xchr[@'42]←'"';
xchr[@'43]←'#';
xchr[@'44]←'$';
xchr[@'45]←'%';
xchr[@'46]←'&';
xchr[@'47]←'''';@/
xchr[@'50]←'(';
xchr[@'51]←')';
xchr[@'52]←'*';
xchr[@'53]←'+';
xchr[@'54]←',';
xchr[@'55]←'-';
xchr[@'56]←'.';
xchr[@'57]←'/';@/
xchr[@'60]←'0';
xchr[@'61]←'1';
xchr[@'62]←'2';
xchr[@'63]←'3';
xchr[@'64]←'4';
xchr[@'65]←'5';
xchr[@'66]←'6';
xchr[@'67]←'7';@/
xchr[@'70]←'8';
xchr[@'71]←'9';
xchr[@'72]←':';
xchr[@'73]←';';
xchr[@'74]←'<';
xchr[@'75]←'=';
xchr[@'76]←'>';
xchr[@'77]←'?';@/
xchr[@'100]←'@@';
xchr[@'101]←'A';
xchr[@'102]←'B';
xchr[@'103]←'C';
xchr[@'104]←'D';
xchr[@'105]←'E';
xchr[@'106]←'F';
xchr[@'107]←'G';@/
xchr[@'110]←'H';
xchr[@'111]←'I';
xchr[@'112]←'J';
xchr[@'113]←'K';
xchr[@'114]←'L';
xchr[@'115]←'M';
xchr[@'116]←'N';
xchr[@'117]←'O';@/
xchr[@'120]←'P';
xchr[@'121]←'Q';
xchr[@'122]←'R';
xchr[@'123]←'S';
xchr[@'124]←'T';
xchr[@'125]←'U';
xchr[@'126]←'V';
xchr[@'127]←'W';@/
xchr[@'130]←'X';
xchr[@'131]←'Y';
xchr[@'132]←'Z';
xchr[@'133]←'[';
xchr[@'134]←'\';
xchr[@'135]←']';
xchr[@'136]←'↑';
xchr[@'137]←'_';@/
xchr[@'140]←'`';
xchr[@'141]←'a';
xchr[@'142]←'b';
xchr[@'143]←'c';
xchr[@'144]←'d';
xchr[@'145]←'e';
xchr[@'146]←'f';
xchr[@'147]←'g';@/
xchr[@'150]←'h';
xchr[@'151]←'i';
xchr[@'152]←'j';
xchr[@'153]←'k';
xchr[@'154]←'l';
xchr[@'155]←'m';
xchr[@'156]←'n';
xchr[@'157]←'o';@/
xchr[@'160]←'p';
xchr[@'161]←'q';
xchr[@'162]←'r';
xchr[@'163]←'s';
xchr[@'164]←'t';
xchr[@'165]←'u';
xchr[@'166]←'v';
xchr[@'167]←'w';@/
xchr[@'170]←'x';
xchr[@'171]←'y';
xchr[@'172]←'z';
xchr[@'173]←'{';
xchr[@'174]←'|';
xchr[@'175]←'}';
xchr[@'176]←'~';
@ The following system-independent code makes the |xord| array contain a
suitable inverse to the information in |xchr|.
@<Set init...@>=
for i←first_text_char to last_text_char do xord[chr(i)]←@'40;
for i←" " to "~" do xord[xchr[i]]←i;
@* Device-independent file format.
Before we get into the details of \.{DVItype}, we need to know exactly
what \.{DVI} files are. The form of such files was designed by David R.
Fuchs in 1979. Almost any reasonable device can be driven by a program
that takes \.{DVI} files as input, and dozens of such \.{DVI}-to-whatever
programs have been written. Thus, it is possible to print the output of
document compilers like \TeX\ on many different kinds of equipment.
A \.{DVI} file is a stream of 8-bit bytes, which may be regarded as a
series of commands in a machine-like language. The first byte of each command
is the operation code, and this code is followed by zero or more bytes
that provide parameters to the command. The parameters themselves may consist
of several consecutive bytes; for example, the `|set_rule|' command has two
parameters, each of which is four bytes long. Parameters are usually
regarded as nonnegative integers; but four-byte-long parameters,
and shorter parameters that denote distances, can be
either positive or negative. Such parameters are given in two's complement
notation. For example, a two-byte-long distance parameter has a value between
$-2↑{15}$ and $2↑{15}-1$.
@.DVI {\rm files}@>
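The two's complement rule can be illustrated by a short sketch in Python
(not part of \.{DVItype} itself). It assumes that the most significant byte
of a parameter comes first, which is the \.{DVI} convention; the function
names \.{unsigned\_value} and \.{signed\_value} are hypothetical.

```python
def unsigned_value(data: bytes) -> int:
    """Interpret 1 to 4 bytes, most significant first, as a nonnegative integer."""
    value = 0
    for b in data:
        value = value * 256 + b
    return value


def signed_value(data: bytes) -> int:
    """Interpret 1 to 4 bytes, most significant first, in two's complement."""
    value = unsigned_value(data)
    if data[0] >= 128:              # the sign bit of the leading byte is set
        value -= 256 ** len(data)   # subtract 2**(8k) for a k-byte parameter
    return value
```

With these definitions the two-byte parameter consisting of bytes 128 and 0
decodes to $-2↑{15}$, and bytes 127 and 255 decode to $2↑{15}-1$.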
A \.{DVI} file consists of a sequence of one or more ``pages,'' followed by
a ``postamble.'' A ``page'' consists of a |bop| command, followed by any number
of other commands that tell where characters are to be placed on a physical
page, followed by an |eop| command. The pages appear in the order that \TeX\
generated them. If we ignore |nop| commands (which are allowed between
any two commands in the file), each |eop| command is immediately followed by
a |bop| command, or by a |pst| command; in the latter case, there are no
more pages in the file, and the remaining bytes form the postamble.
Further details about the postamble will be explained later.
Some parameters in \.{DVI} commands are ``pointers.'' These are four-byte
quantities that give the location number of some other byte in the file;
the first byte is number@@0, then comes number@@1, and so on. For example,
one of the parameters of a |bop| command points to the previous |bop|;
this makes it feasible to read the pages in backwards order, in case you
are producing output on devices that stack their output face up. If the
first page on a \.{DVI} file occupies bytes 0 to 99, and if the second
page occupies bytes 100 to 299, then the |bop| that starts in byte 100
points to 0 and the |bop| that starts in byte 300 points to 100. (The
first |bop|, i.e., the one that starts in byte 0, has a pointer of $-1$.)
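The back-pointer scheme might be exercised as in the following Python
sketch. It is an illustration under stated assumptions rather than the
method used by \.{DVItype}: four-byte parameters are taken most significant
byte first, the helper \.{make\_page} fabricates a minimal 46-byte page for
the demonstration, and both function names are hypothetical.

```python
import struct

BOP = 139  # opcode of the bop command
EOP = 140  # opcode of the eop command


def pages_in_reverse(dvi: bytes, last_bop: int):
    """Yield the byte locations of the bop commands, last page first."""
    loc = last_bop
    while loc != -1:
        if not 0 <= loc < len(dvi) or dvi[loc] != BOP:
            raise ValueError(f"location {loc} does not hold a bop")
        yield loc
        # the pointer p follows the opcode and the ten four-byte c parameters
        (loc,) = struct.unpack_from(">i", dvi, loc + 1 + 40)


def make_page(prev: int) -> bytes:
    """Build an empty page: bop, ten zero counts, back pointer, eop."""
    return bytes([BOP]) + bytes(40) + struct.pack(">i", prev) + bytes([EOP])


demo = make_page(-1)   # first page starts in byte 0
demo += make_page(0)   # second page starts in byte 46
demo += make_page(46)  # third page starts in byte 92
reverse_order = list(pages_in_reverse(demo, 92))
```

Starting from the final |bop| in byte 92, the loop visits bytes 92, 46,
and 0, and stops when it reads the pointer $-1$.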
@ The \.{DVI} format is intended to be both compact and easily interpreted
by a machine. Compactness is achieved by making most of the information
implicit instead of explicit; when a \.{DVI}-reading program reads the
commands for a page, it keeps track of several quantities: (a)@@The current
font |f| is an integer; this value is changed only
by \\{fnt} and \\{fnt\_num} commands. (b)@@The current position on the page
is given by two numbers called the horizontal and vertical coordinates,
|h| and |v|. Both coordinates are zero at the upper left corner of the page;
moving to the right corresponds to increasing the horizontal coordinate, and
moving down corresponds to increasing the vertical coordinate. Thus, the
coordinates are essentially Cartesian, except that vertical directions are
flipped; the Cartesian version of |(h,v)| would be |(h,-v)|. (c)@@The
current spacing amounts are given by four numbers |w|, |x|, |y|, and |z|,
where |w| and@@|x| are used for horizontal spacing and where |y| and@@|z|
are used for vertical spacing. (d)@@There is a stack containing
|(h,v,w,x,y,z)| values; the \.{DVI} commands |push| and |pop| are used to
change the current level of operation. Note that the current font@@|f| is
not pushed and popped; the stack contains only information about
positioning.
The values of |h|, |v|, |w|, |x|, |y|, and |z| are signed integers having up
to 32 bits, including the sign. Since they represent physical distances,
there is a small unit of measurement such that increasing |h| by@@1 means
moving a certain tiny distance to the right. The actual unit of
measurement is variable, as explained below.
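The bookkeeping just described could be modelled as in the Python sketch
below. The class names \.{Position} and \.{PageState} are hypothetical, and
the sketch covers only the stack discipline, not the interpretation of
individual commands.

```python
from dataclasses import dataclass, astuple


@dataclass
class Position:
    """The six positioning quantities that are saved by push."""
    h: int = 0
    v: int = 0
    w: int = 0
    x: int = 0
    y: int = 0
    z: int = 0


class PageState:
    def __init__(self):
        self.f = None            # current font; kept outside the stack
        self.pos = Position()
        self.stack = []

    def push(self):
        """Save (h,v,w,x,y,z) without changing the current values."""
        self.stack.append(astuple(self.pos))

    def pop(self):
        """Restore (h,v,w,x,y,z) from the top of the stack."""
        if not self.stack:
            raise ValueError("pop with an empty stack")
        self.pos = Position(*self.stack.pop())


state = PageState()
state.f = 3
state.pos.h = 100
state.push()
state.pos.h += 50   # move right inside the pushed level
state.f = 7         # a font change inside the pushed level
state.pop()         # h returns to 100, but f stays 7
```

Keeping |f| outside the stacked record is what makes a font change survive
a |pop|, in agreement with the remark that only positioning is stacked.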
@ Here is a list of all the commands that may appear in a \.{DVI} file. With
each command we give its symbolic name (e.g., |bop|), its opcode byte
(e.g., 139), and its parameters (if any). The parameters are followed
by a bracketed number telling how many bytes they occupy; for example,
`|p[4]|' means that parameter |p| is four bytes long.
\yskip\hang|set_char_0| 0. Typeset character number@@0 from font@@|f|
such that the reference point of the character is at |(h,v)|. Then
increase |h| by the width of that character. Note that a character may
have zero or negative width, so one cannot be sure that |h| will advance
after this command; but |h| usually does increase.
\yskip\hang|set_char_1| through |set_char_127| (opcodes 1 to 127).
Do the operations of |set_char_0|, but use the appropriate character number
instead of character@@0.
\yskip\hang|set1| 128 |c[1]|. Same as |set_char_0|, except that character
number@@|c| is typeset. \TeX82 uses this command for characters in the
range |128≤c<256|.
\yskip\hang|set2| 129 |c[2]|. Same as |set1|, except that@@|c| is two
bytes long, so it is in the range |0≤c<65536|. \TeX82 never uses this
command, which is intended for processors that deal with oriental languages;
but \.{DVItype} will allow character codes greater than 255, assuming that
they all have the same width as character 256.
@↑oriental characters@>@↑Chinese characters@>@↑Japanese characters@>
\yskip\hang|set3| 130 |c[3]|. Same as |set1|, except that@@|c| is three
bytes long, so it can be as large as $2↑{24}-1$. Not even the Chinese
language has this many characters, but this command might prove useful
in some yet unforeseen way.
\yskip\hang|set4| 131 |c[4]|. Same as |set1|, except that@@|c| is four
bytes long, possibly even negative. Imagine that.
\yskip\hang|set_rule| 132 |a[4]| |b[4]|. Typeset a solid black rectangle
of height |a| and width |b|, with its bottom left corner at |(h,v)|. Then
set |h←h+b|. If either |a≤0| or |b≤0|, nothing should be typeset. Note
that if |b<0|, the value of |h| will decrease even though nothing else happens.
Programs that typeset from \.{DVI} files should take care to make the rules
line up with digitized characters, as explained in connection with
the |rule_pixels| subroutine below.
\yskip\hang|put1| 133 |c[1]|. Typeset character number@@|c| from font@@|f|
such that the reference point of the character is at |(h,v)|. (The `put'
commands are exactly like the `set' commands, except that they simply put out a
character or a rule without moving the reference point afterwards.)
\yskip\hang|put2| 134 |c[2]|. Same as |set2|, except that |h| is not changed.
\yskip\hang|put3| 135 |c[3]|. Same as |set3|, except that |h| is not changed.
\yskip\hang|put4| 136 |c[4]|. Same as |set4|, except that |h| is not changed.
\yskip\hang|put_rule| 137 |a[4]| |b[4]|. Same as |set_rule|, except that
|h| is not changed.
\yskip\hang|nop| 138. No operation, do nothing. Any number of |nop|'s
may occur between \.{DVI} commands, but a |nop| cannot be inserted between
a command and its parameters or between two parameters.
\yskip\hang|bop| 139 $c↓0[4]$ $c↓1[4]$ $\ldots$ $c↓9[4]$ $p[4]$. Beginning
of a page: Set |(h,v,w,x,y,z)←(0,0,0,0,0,0)| and set the stack empty. Set
the current font |f| to an undefined value. The ten $c↓i$ parameters can
be used to identify pages, if a user wants to print only part of a \.{DVI}
file; \TeX82 gives them the values of \.{\\count0} $\ldots$ \.{\\count9}
at the time \.{\\shipout} was invoked for this page. The parameter |p|
points to the previous |bop| command in the file, where the first |bop|
has $p=-1$.
\yskip\hang|eop| 140. End of page: Print what you have read since the
previous |bop|. At this point the stack should be empty. (The \.{DVI}-reading
programs that drive most output devices will have kept a buffer of the
material that appears on the page that has just ended. This material is
largely, but not entirely, in order by |v| coordinate and (for fixed |v|) by
|h|@@coordinate; so it usually needs to be sorted into some order that is
appropriate for the device in question. \.{DVItype} does not do such sorting.)
\yskip\hang|push| 141. Push the current values of |(h,v,w,x,y,z)| onto the
top of the stack; do not change any of these values. Note that |f| is
not pushed.
\yskip\hang|pop| 142. Pop the top six values off of the stack and assign
them to |(h,v,w,x,y,z)|. The number of pops should never exceed the number
of pushes, since it would be highly embarrassing if the stack were empty
at the time of a |pop| command.
\yskip\hang|right1| 143 |b[1]|. Set |h←h+b|, i.e., move right |b| units.
The parameter is a signed number in two's complement notation, |-128≤b<128|;
if |b<0|, the reference point actually moves left.
\yskip\hang|right2| 144 |b[2]|. Same as |right1|, except that |b| is a
two-byte quantity in the range |-32768≤b<32768|.
\yskip\hang|right3| 145 |b[3]|. Same as |right1|, except that |b| is a
three-byte quantity in the range |@t$-2↑{23}$@>≤b<@t$2↑{23}$@>|.
\yskip\hang|right4| 146 |b[4]|. Same as |right1|, except that |b| is a
four-byte quantity in the range |@t$-2↑{31}$@>≤b<@t$2↑{31}$@>|.
\yskip\hang|w0| 147. Set |h←h+w|; i.e., move right |w| units. With luck,
this parameterless command will usually suffice, because the same kind of motion
will occur several times in succession; the following commands explain how
|w| gets particular values.
\yskip\hang|w1| 148 |b[1]|. Set |w←b| and |h←h+b|. The value of |b| is a
signed quantity in two's complement notation, |-128≤b<128|. This command
changes the current |w|@@spacing and moves right by |b|.
\yskip\hang|w2| 149 |b[2]|. Same as |w1|, but |b| is a two-byte-long
parameter, |-32768≤b<32768|.
\yskip\hang|w3| 150 |b[3]|. Same as |w1|, but |b| is a three-byte-long
parameter, |@t$-2↑{23}$@>≤b<@t$2↑{23}$@>|.
\yskip\hang|w4| 151 |b[4]|. Same as |w1|, but |b| is a four-byte-long
parameter, |@t$-2↑{31}$@>≤b<@t$2↑{31}$@>|.
\yskip\hang|x0| 152. Set |h←h+x|; i.e., move right |x| units. The `|x|'
commands are like the `|w|' commands except that they involve |x| instead
of |w|.
\yskip\hang|x1| 153 |b[1]|. Set |x←b| and |h←h+b|. The value of |b| is a
signed quantity in two's complement notation, |-128≤b<128|. This command
changes the current |x|@@spacing and moves right by |b|.
\yskip\hang|x2| 154 |b[2]|. Same as |x1|, but |b| is a two-byte-long
parameter, |-32768≤b<32768|.
\yskip\hang|x3| 155 |b[3]|. Same as |x1|, but |b| is a three-byte-long
parameter, |@t$-2↑{23}$@>≤b<@t$2↑{23}$@>|.
\yskip\hang|x4| 156 |b[4]|. Same as |x1|, but |b| is a four-byte-long
parameter, |@t$-2↑{31}$@>≤b<@t$2↑{31}$@>|.
\yskip\hang|down1| 157 |a[1]|. Set |v←v+a|, i.e., move down |a| units.
The parameter is a signed number in two's complement notation, |-128≤a<128|;
if |a<0|, the reference point actually moves up.
\yskip\hang|down2| 158 |a[2]|. Same as |down1|, except that |a| is a
two-byte quantity in the range |-32768≤a<32768|.
\yskip\hang|down3| 159 |a[3]|. Same as |down1|, except that |a| is a
three-byte quantity in the range |@t$-2↑{23}$@>≤a<@t$2↑{23}$@>|.
\yskip\hang|down4| 160 |a[4]|. Same as |down1|, except that |a| is a
four-byte quantity in the range |@t$-2↑{31}$@>≤a<@t$2↑{31}$@>|.
\yskip\hang|y0| 161. Set |v←v+y|; i.e., move down |y| units. With luck,
this parameterless command will usually suffice, because the same kind of motion
will occur several times in succession; the following commands explain how
|y| gets particular values.
\yskip\hang|y1| 162 |a[1]|. Set |y←a| and |v←v+a|. The value of |a| is a
signed quantity in two's complement notation, |-128≤a<128|. This command
changes the current |y|@@spacing and moves down by |a|.
\yskip\hang|y2| 163 |a[2]|. Same as |y1|, but |a| is a two-byte-long
parameter, |-32768≤a<32768|.
\yskip\hang|y3| 164 |a[3]|. Same as |y1|, but |a| is a three-byte-long
parameter, |@t$-2↑{23}$@>≤a<@t$2↑{23}$@>|.
\yskip\hang|y4| 165 |a[4]|. Same as |y1|, but |a| is a four-byte-long
parameter, |@t$-2↑{31}$@>≤a<@t$2↑{31}$@>|.
\yskip\hang|z0| 166. Set |v←v+z|; i.e., move down |z| units. The `|z|' commands
are like the `|y|' commands except that they involve |z| instead of |y|.
\yskip\hang|z1| 167 |a[1]|. Set |z←a| and |v←v+a|. The value of |a| is a
signed quantity in two's complement notation, |-128≤a<128|. This command
changes the current |z|@@spacing and moves down by |a|.
\yskip\hang|z2| 168 |a[2]|. Same as |z1|, but |a| is a two-byte-long
parameter, |-32768≤a<32768|.
\yskip\hang|z3| 169 |a[3]|. Same as |z1|, but |a| is a three-byte-long
parameter, |@t$-2↑{23}$@>≤a<@t$2↑{23}$@>|.
\yskip\hang|z4| 170 |a[4]|. Same as |z1|, but |a| is a four-byte-long
parameter, |@t$-2↑{31}$@>≤a<@t$2↑{31}$@>|.
\yskip\hang|fnt_num_0| 171. Set |f←0|.
\yskip\hang|fnt_num_1| through |fnt_num_63| (opcodes 172 to 234). Set
|f←1|, $\ldotss$, |f←63|, respectively.
\yskip\hang|fnt1| 235 |n[1]|. Set |f←n|. \TeX82 uses this command for font
numbers in the range |64≤n<256|.
\yskip\hang|fnt2| 236 |n[2]|. Same as |fnt1|, except that@@|n| is two
bytes long, so it is in the range |0≤n<65536|. \TeX82 never generates this
command, but large font numbers may prove useful for specifications of
color or texture, or they may be used for special fonts that have fixed
numbers in some external coding scheme.
\yskip\hang|fnt3| 237 |n[3]|. Same as |fnt1|, except that@@|n| is three
bytes long, so it can be as large as $2↑{24}-1$.
\yskip\hang|fnt4| 238 |n[4]|. Same as |fnt1|, except that@@|n| is four
bytes long; this is for the really big font numbers. The value $-1$
is forbidden, so the legal values of |f| are $-2↑{31}\L f<-1$ and
$0\L f<2↑{31}$.
\yskip\hang|xxx1| 239 |m[1]| |x[m]|. This command is undefined in
general; it functions as an $(m+2)$-byte |nop| unless special \.{DVI}-reading
programs are being used. \TeX82 generates this command when an \.{\\xsend}
appears, setting |m| to the number of bytes being sent. It is recommended that
|x| be a string having the form of a keyword followed by possible parameters
relevant to that keyword. Examples: |x='halftone fig22'| could mean
``insert a halftone from file fig22, with its reference point at |(h,v)|'';
|x='leftend 2'| and an appearance elsewhere of |x='rightend 2'| could mean
``draw a straight line from the left |(h,v)| position to the right one''
(where the `\.2' is an identifier to distinguish this straight line
from others in the file); |x='message Foo'| could mean ``display `\.{Foo}'
on the console of the printing device''; and so on. The command does not
change any of the status values |f|, |h|, |v|, |w|, |x|, |y|, |z| or the stack.
\yskip\hang|xxx2| 240 |m[2]| |x[m]|. Like |xxx1|, but |0≤m<65536|.
\yskip\hang|xxx3| 241 |m[3]| |x[m]|. Like |xxx1|, but |0≤m<@t$2↑{24}$@>|.
\yskip\hang|xxx4| 242 |m[4]| |x[m]|. Like |xxx1|, but |m| can be ridiculously
large.
\yskip\hang|pst| 243. Beginning of the postamble, see below.
\yskip\noindent Commands 244--255 are undefined at the present time.
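The opcode assignments listed above can be summarized by a small lookup
routine, sketched here in Python. The function name \.{describe} and the
table names are hypothetical. The routine returns the symbolic name and the
number of parameter bytes of fixed length; for the \\{xxx} commands that
count covers only the length field |m|, which is followed by |m| further
bytes.

```python
# Families of four commands whose k-th member takes one k-byte parameter.
FAMILIES_OF_FOUR = {128: "set", 133: "put", 143: "right", 157: "down",
                    235: "fnt", 239: "xxx"}
# Families of five: member 0 has no parameter, member k has k bytes.
FAMILIES_OF_FIVE = {147: "w", 152: "x", 161: "y", 166: "z"}
# Commands that stand alone: name and number of parameter bytes.
SINGLES = {132: ("set_rule", 8), 137: ("put_rule", 8), 138: ("nop", 0),
           139: ("bop", 44), 140: ("eop", 0), 141: ("push", 0),
           142: ("pop", 0), 243: ("pst", 0)}


def describe(opcode: int):
    """Return (name, fixed parameter bytes) for an opcode byte."""
    if not 0 <= opcode <= 255:
        raise ValueError("an opcode must fit in one byte")
    if opcode <= 127:
        return f"set_char_{opcode}", 0
    if 171 <= opcode <= 234:
        return f"fnt_num_{opcode - 171}", 0
    if opcode in SINGLES:
        return SINGLES[opcode]
    for first, name in FAMILIES_OF_FOUR.items():
        if first <= opcode < first + 4:
            k = opcode - first + 1
            return f"{name}{k}", k
    for first, name in FAMILIES_OF_FIVE.items():
        if first <= opcode < first + 5:
            k = opcode - first
            return f"{name}{k}", k
    return "undefined", 0   # opcodes 244 through 255
```

A table-driven reader of this kind can skip over any command without
interpreting it, which is all that is needed to locate page boundaries.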
@ @d set_char_0=0 {typeset character 0 and move right}
@d set1=128 {typeset a character and move right}
@d set_rule=132 {typeset a rule and move right}
@d put1=133 {typeset a character}
@d put_rule=137 {typeset a rule}
@d nop=138 {no operation}
@d bop=139 {beginning of page}
@d eop=140 {ending of page}
@d push=141 {save the current positions}
@d pop=142 {restore previous positions}
@d right1=143 {move right}
@d w0=147 {move right by |w|}
@d w1=148 {move right and set |w|}
@d x0=152 {move right by |x|}
@d x1=153 {move right and set |x|}
@d down1=157 {move down}
@d y0=161 {move down by |y|}
@d y1=162 {move down and set |y|}
@d z0=166 {move down by |z|}
@d z1=167 {move down and set |z|}
@d fnt_num_0=171 {set current font to 0}
@d fnt1=235 {set current font}
@d xxx1=239 {extension to \.{DVI} primitives}
@d pst=243 {postamble}
@d undefined_commands==244,245,246,247,248,249,250,251,252,253,254,255
@ The last page in a \.{DVI} file is followed by `|pst|'; this command
introduces the postamble, which summarizes important facts that \TeX\ has
accumulated about the file. The postamble has the form
$$\hbox{|p[4]| |n[4]| |d[4]| |m[4]| |l[4]| |u[4]| |s[2]| |t[2]|
$\langle\,$font definitions$\,\rangle$
$(-1)[4]$ |q[4]| |i[1]| 223's|[≥4]|}$$
Here |p| is a pointer to the final |bop| in the file. The next two parameters,
|n| and |d|, are positive integers that define the units of measurement;
they are the numerator and denominator of a fraction by which all dimensions
in the \.{DVI} file could be multiplied in order to get lengths in units
of $10↑{-7}$ meters. \.{DVI}-reading programs should do their basic arithmetic
with the unscaled units that appear in the file, multiplying by a conversion
factor only at the last step before outputting to another device; in this way,
rounding errors will not accumulate and there will be perfect agreement with
the assumptions of the document compiler that generated the \.{DVI} file.
The next parameter, |m|, is \TeX's \.{\\mag} parameter, i.e., 1000 times the
desired magnification. The actual fraction by which dimensions are multiplied
is therefore $mn/(1000d)$. Fancy \.{DVI}-reading programs allow users to
override the |m| setting when a \.{DVI} file is being printed.
Parameters |l| and |u| give respectively the height-plus-depth of the tallest
page and the width of the widest page, in the same units as other dimensions
of the file. These numbers might be used by a \.{DVI}-reading program to
position individual ``pages'' on large sheets of film or paper.
Parameter |s| is the maximum stack depth (i.e., the excess of |push| commands
over |pop| commands) needed to process this file. Then comes |t|, the total
number of pages (|bop| commands) present.
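The advice about converting only at the last step can be made concrete
with a Python sketch. The function name and the \.{resolution} argument
(pixels per inch of an output device) are assumptions of the sketch, and the
values of |n|, |d|, and the widths are invented so that the arithmetic is
easy to follow; they are not taken from any real \.{DVI} file. One inch is
254000 units of $10↑{-7}$ meters.

```python
from fractions import Fraction


def pixels_per_dvi_unit(n: int, d: int, m: int, resolution: int) -> Fraction:
    """Exact factor that converts DVI units to device pixels."""
    to_tenth_microns = Fraction(m * n, 1000 * d)   # the fraction mn/(1000d)
    return to_tenth_microns * Fraction(resolution) / 254000


# Invented postamble values: each DVI unit is 1/300 inch, no magnification.
conv = pixels_per_dvi_unit(n=254000, d=300, m=1000, resolution=120)

widths = [6] * 10   # ten horizontal moves of 6 DVI units, i.e. 2.4 pixels each

# Rounding after every move lets the error accumulate.
rounded_each_step = sum(round(w * conv) for w in widths)

# Adding in DVI units and rounding once keeps the final position exact.
rounded_at_end = round(sum(widths) * conv)
```

Here the factor is exactly 2/5, so rounding each move gives 20 pixels while
the true position is 24 pixels; the difference of four pixels is the kind
of drift that makes margins ragged.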
@ The postamble continues with font definitions, which are any number of
specifications having the form
$$\hbox{|f[4]| $c↓f[4]$ $s↓f[4]$ $d↓f[4]$ $a↓f[1]$ $l↓f[1]$ $n↓f[a↓f+l↓f]$.}$$
The first parameter in a font definition is the font number, $f$; this
must be different from $-1$ and distinct from the font numbers in other
definitions. (Note that the font definitions are followed by the four-byte
value $-1$, so it will be clear when the definitions have ended.) The
next parameter, $c↓f$, is the check sum that \TeX\ found in the \.{TFM}
file for this font; it should match the check sum of the font found by
@↑check sum@>
programs that read this \.{DVI} file. (Otherwise the font information has
changed since the time the \.{DVI} file was generated, and things may no
longer line up properly.)
Parameter $s↓f$ contains a fixed-point scale factor that is applied to the
character widths in font |f|; font dimensions in \.{TFM} files and other font
files are relative to this quantity. \TeX82 calls this the ``at size'' of
the font, since a particular font can be used at several different
magnifications. The value of $s↓f$ should be positive and less than
$2↑{27}$. It is given in the same units as the other dimensions of the file.
Parameter $d↓f$ is similar to $s↓f$; it is the ``design size'' found in
the \.{TFM} file, but expressed in \.{DVI} units that have not been
corrected for the magnification@@|m|. Thus, font |f| is to be used at
$ms↓f/(1000d↓f)$ times its normal size.
The remaining part of the font definition gives the external name of the font,
which is an ascii string of length $a↓f+l↓f$. The number $a↓f$ is the length
of the ``area'' or directory, and $l↓f$ is the length of the font name itself;
the standard local system font area is supposed to be used when $a↓f=0$.
The $n↓f$ field contains the area in its first $a↓f$ bytes.
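A reader for this part of the postamble might look like the following
Python sketch. It assumes that multi-byte fields are stored most
significant byte first; the function name \.{read\_font\_defs}, the
dictionary keys, and the sample font are hypothetical.

```python
import struct


def read_font_defs(data: bytes, pos: int = 0):
    """Parse font definitions from data[pos:] up to the four-byte value -1.

    Returns the list of definitions and the position just after the -1.
    """
    fonts = []
    while True:
        (f,) = struct.unpack_from(">i", data, pos)   # font number
        pos += 4
        if f == -1:                                  # end of the definitions
            return fonts, pos
        # check sum, at size, design size, area length, name length
        c, s, d, a, l = struct.unpack_from(">IiiBB", data, pos)
        pos += 14
        name = data[pos:pos + a + l].decode("ascii")
        pos += a + l
        fonts.append({"f": f, "check_sum": c, "at_size": s, "design_size": d,
                      "area": name[:a], "name": name[a:]})


# A made-up definition of font 0 with an empty area, followed by -1.
sample = (struct.pack(">iIiiBB", 0, 0x12345678, 655360, 655360, 0, 5)
          + b"cmr10" + struct.pack(">i", -1))
fonts, end = read_font_defs(sample)
```

Because the area occupies the first $a↓f$ bytes of the name field, one
slice at position $a↓f$ separates the directory from the font name.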
@ The last part of the postamble, following the phony font number
$-1$, contains |q|, a pointer to the |pst| command that started the
postamble. An identification byte, |i|, comes next; currently this byte
is always set to@@2. (Some day we will set |i=3|, when \.{DVI} format
makes another incompatible change---perhaps in 1992.)
Following the |i| byte there are four or more bytes that are all equal to
the decimal number 223 (i.e., @'337 in octal). \TeX\ puts out four to seven of
these trailing bytes, until the total length of the file is a multiple of
four bytes, since this works out best on machines that pack four bytes per
word; but any number of 223's is allowed, as long as there are at least four
of them. In effect, 223 is a sort of signature that is added at the very end.
This curious way to finish off a \.{DVI} file makes it feasible for
\.{DVI}-reading programs to find the postamble first, on most computers,
even though \TeX\ wants to write the postamble last. Most operating
systems permit random access to individual words or bytes of a file, so
the \.{DVI} reader starts at the end and skips backwards over the 223's
until finding the identification byte. Then it backs up four bytes, reads
|q|, and goes to byte |q| of the file. This byte should, of course,
contain the value 243 (|pst|); now the postamble can be read, so the
\.{DVI} reader discovers all the information needed for typesetting the
pages. Note that it is also possible to skip through the \.{DVI} file at
reasonably high speed to locate a particular page, if that proves
desirable.
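The backward search just described might be sketched as follows in Python.
This is a hedged illustration, not the code \.{DVItype} uses: the function
name |find_postamble| is hypothetical, the whole file is assumed to be
available as a byte string, and the constants are the values quoted in the
text above.

```python
PST = 243        # opcode that starts the postamble, as quoted in the text
ID_BYTE = 2      # current identification byte
SIGNATURE = 223  # trailing signature byte


def find_postamble(data: bytes) -> int:
    """Return q, the position of the pst command, by scanning from the end."""
    k = len(data) - 1
    trailing = 0
    while k >= 0 and data[k] == SIGNATURE:  # skip backwards over the 223's
        k -= 1
        trailing += 1
    if trailing < 4:
        raise ValueError("fewer than four trailing 223 bytes")
    if k < 4 or data[k] != ID_BYTE:
        raise ValueError("identification byte not found")
    # The four bytes just before the identification byte hold the pointer q.
    q = int.from_bytes(data[k - 4:k], "big", signed=True)
    if not 0 <= q < k - 4 or data[q] != PST:
        raise ValueError("pointer q does not lead to a pst command")
    return q
```

The sketch rejects a file whose pointer does not land on a |pst| byte, which
corresponds to the check ``this byte should, of course, contain the value 243''.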
The reason for reading the postamble first is that the \.{DVI} reader must
know the widths of characters, in order to find out where things go on a page;
and it needs to know the names of the fonts, so that it can get their widths
from a \.{TFM} file or from some other kind of font-information file.
The reason for writing the postamble last is that \TeX\ can't put out all
the font names until it has finished generating the pages of the \.{DVI}
file, since new fonts can occur anywhere in a \TeX\ job; and the alternative
of sprinkling font definitions throughout a \.{DVI} file is unattractive,
since that would make it necessary to read the whole file even when
printing only one page. Furthermore, we wouldn't want to copy the
information in the first part of a \.{DVI} file to the end of another file
that begins with the postamble information, since the first part of a
\.{DVI} file is typically quite long.
Unfortunately, however, standard \PASCAL\ does not include the ability to
@↑system dependencies@>
access a random position in a file, or even to determine the length of a file.
Almost all systems nowadays provide the necessary capabilities, so \.{DVI}
format has been designed to work most efficiently with modern operating systems.
As noted above, \.{DVItype} will limit itself to the restrictions of standard
\PASCAL\ if |random_reading| is defined to be |false|.
@d id_byte=2 {identifies the kind of \.{DVI} files described here}
@* Input from binary files.
We have seen that a \.{DVI} file is a sequence of 8-bit bytes. The bytes
appear physically in what is called a `\!|packed file of
0..255|\unskip' in \PASCAL\ lingo.
Packing is system dependent, and many \PASCAL\ systems fail to implement
such files in a sensible way (at least, from the viewpoint of producing
good production software). For example, some systems treat all
byte-oriented files as text, looking for end-of-line marks and such
things. Therefore some system-dependent code is often needed to deal with
binary files, even though most of the program in this section of
\.{DVItype} is written in standard \PASCAL.
@↑system dependencies@>
One common way to solve the problem is to consider files of |integer|
numbers, and to convert an integer in the range $-2↑{31}\L x<2↑{31}$ to
a sequence of four bytes $(a,b,c,d)$ using the following code, which
avoids the controversial integer division of negative numbers:
$$\vbox{\halign{#\hfil\cr
|if x≥0 then a←x div @'100000000|\cr
|else begin x←(x+@'10000000000)+@'10000000000; a←x div @'100000000+128;|\cr
\quad|end|\cr
|x←x mod @'100000000;|\cr
|b←x div @'200000; x←x mod @'200000;|\cr
|c←x div @'400; d←x mod @'400;|\cr}}$$
The four bytes are then kept in a buffer and output one by one. (On 36-bit
computers, an additional division by 16 is necessary at the beginning.
Another way to separate an integer into four bytes is to use/abuse
\PASCAL's variant records, storing an integer and fetching bytes that are
packed in the same place; {\sl caveat implementor!\/}) It is also desirable
in some cases to read a hundred or so integers at a time, maintaining a
larger buffer.
We shall stick to simple \PASCAL\ in this program, for reasons of clarity,
even if such simplicity is sometimes unrealistic.
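For readers who wish to experiment, here is a Python transcription of the
conversion displayed above. It is a sketch only; the function name
|four_bytes| is hypothetical. All divisions are applied to nonnegative
operands, just as in the \PASCAL\ fragment, so the result does not depend
on how a language rounds quotients of negative numbers.

```python
def four_bytes(x: int):
    """Split x, where -2**31 <= x < 2**31, into bytes (a, b, c, d)."""
    if not -2 ** 31 <= x < 2 ** 31:
        raise ValueError("x does not fit in four bytes")
    if x >= 0:
        a = x // 0o100000000
    else:
        # Add 2**31 in two steps, as the Pascal code does, then set the sign bit.
        x = (x + 0o10000000000) + 0o10000000000
        a = x // 0o100000000 + 128
    x = x % 0o100000000
    b = x // 0o200000
    x = x % 0o200000
    c = x // 0o400
    d = x % 0o400
    return a, b, c, d
```

The four values form the two's complement representation of the integer,
most significant byte first.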
@<Types...@>=
@!eight_bits=0..255; {unsigned one-byte quantity}
@!byte_file=packed file of eight_bits; {files that contain binary data}
@ The program deals with two binary file variables: |dvi_file| is the main
input file that we are translating into symbolic form, and |tfm_file| is
the current font metric file from which character-width information is
being read.
@<Glob...@>=
@!dvi_file:byte_file; {the stuff we are \.{DVI}typing}
@!tfm_file:byte_file; {a font metric file}
@ To prepare these files for input, we |reset| them. An extension of \PASCAL\
is needed in the case of |tfm_file|, since we want to associate it with
external files whose names are specified dynamically (i.e., not known
at compile time). The following code assumes that `|reset(f,s)|' does this,
when |f| is a file variable and |s| is a string variable that specifies
the file name. If |eof(f)| is true immediately after |reset(f,s)| has acted,
we assume that no file named |s| is accessible.
@↑system dependencies@>
@p procedure open_dvi_file; {prepares to read packed bytes in |dvi_file|}
begin reset(dvi_file);
cur_loc←0;
end;
@#
procedure open_tfm_file; {prepares to read packed bytes in |tfm_file|}
begin reset(tfm_file,cur_name);
end;
@ If you looked carefully at the preceding code, you probably asked,
``What are |cur_loc| and |cur_name|?'' Good question. They're global
variables: |cur_loc| is the number of the byte about to be read next from
|dvi_file|, and |cur_name| is a string variable that will be set to the
current font metric file name before |open_tfm_file| is called.
@<Glob...@>=
@!cur_loc:integer; {where we are about to look, in |dvi_file|}
@!cur_name:packed array[1..name_length] of char; {external name,
with no lower case letters}
@ It turns out to be convenient to read four bytes at a time, when we are
inputting from \.{TFM} files. The input goes into global variables
|b0|, |b1|, |b2|, and |b3|, with |b0| getting the first byte and |b3|
the fourth.
@<Glob...@>=
@!b0,@!b1,@!b2,@!b3: eight_bits; {four bytes input at once}
@ The |read_tfm_word| procedure sets |b0| through |b3| to the next
four bytes in the current \.{TFM} file.
@↑system dependencies@>
@p procedure read_tfm_word;
begin read(tfm_file,b0); read(tfm_file,b1);
read(tfm_file,b2); read(tfm_file,b3);
end;
@ We shall use another set of simple functions to read the next byte or
bytes from |dvi_file|. There are seven possibilities, each of which is
treated as a separate function in order to minimize the overhead for
subroutine calls.
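The arithmetic common to these functions can be seen in the following
Python sketch, which is an illustration under stated assumptions rather than
a replacement for the \PASCAL\ code. The class |ByteReader| and its methods
are hypothetical; unlike the seven functions below, the sketch takes the
number of bytes as a parameter, and it raises an exception at end of file
where |get_byte| would return zero.

```python
import io


class ByteReader:
    """Reads big-endian integers from a byte string, tracking the position."""

    def __init__(self, data: bytes):
        self.stream = io.BytesIO(data)
        self.cur_loc = 0  # number of the byte about to be read next

    def _read(self, n: int) -> bytes:
        chunk = self.stream.read(n)
        if len(chunk) < n:
            raise EOFError("file ended in the middle of a quantity")
        self.cur_loc += n
        return chunk

    def get_unsigned(self, n: int) -> int:
        value = 0
        for byte in self._read(n):
            value = value * 256 + byte
        return value

    def get_signed(self, n: int) -> int:
        chunk = self._read(n)
        # Only the first byte carries the sign: subtract 256 when it is 128 or more.
        value = chunk[0] if chunk[0] < 128 else chunk[0] - 256
        for byte in chunk[1:]:
            value = value * 256 + byte
        return value
```

Separate functions for each length, as in the program itself, avoid the loop
and the extra parameter at the cost of some repetition.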
@↑system dependencies@>
@p function get_byte:integer; {returns the next byte, unsigned}
var b:eight_bits;
begin if eof(dvi_file) then get_byte←0
else begin read(dvi_file,b); incr(cur_loc); get_byte←b;
end;
end;
@#
function signed_byte:integer; {returns the next byte, signed}
var b:eight_bits;
begin read(dvi_file,b); incr(cur_loc);
if b<128 then signed_byte←b @+ else signed_byte←b-256;
end;
@#
function get_two_bytes:integer; {returns the next two bytes, unsigned}
var a,@!b:eight_bits;
begin read(dvi_file,a); read(dvi_file,b);
cur_loc←cur_loc+2;
get_two_bytes←a*256+b;
end;
@#
function signed_pair:integer; {returns the next two bytes, signed}
var a,@!b:eight_bits;
begin read(dvi_file,a); read(dvi_file,b);
cur_loc←cur_loc+2;
if a<128 then signed_pair←a*256+b
else signed_pair←(a-256)*256+b;
end;
@#
function get_three_bytes:integer; {returns the next three bytes, unsigned}
var a,@!b,@!c:eight_bits;
begin read(dvi_file,a); read(dvi_file,b); read(dvi_file,c);
cur_loc←cur_loc+3;
get_three_bytes←(a*256+b)*256+c;
end;
@#
function signed_trio:integer; {returns the next three bytes, signed}
var a,@!b,@!c:eight_bits;
begin read(dvi_file,a); read(dvi_file,b); read(dvi_file,c);
cur_loc←cur_loc+3;
if a<128 then signed_trio←(a*256+b)*256+c
else signed_trio←((a-256)*256+b)*256+c;
end;
@#
function signed_quad:integer; {returns the next four bytes, signed}
var a,@!b,@!c,@!d:eight_bits;
begin read(dvi_file,a); read(dvi_file,b); read(dvi_file,c); read(dvi_file,d);
cur_loc←cur_loc+4;
if a<128 then signed_quad←((a*256+b)*256+c)*256+d
else signed_quad←(((a-256)*256+b)*256+c)*256+d;
end;
@ Finally we come to the routines that are used only if |random_reading| is
|true|. The driver program below needs two such routines: |dvi_length| should
compute the total number of bytes in |dvi_file|, possibly also
causing |eof(dvi_file)| to be true; and |move_to_byte(n)|
should position |dvi_file| so that the next |get_byte| will read byte |n|,
starting with |n=0| for the first byte in the file.
@↑system dependencies@>
Such routines are, of course, highly system dependent. They are implemented
here in terms of two assumed system routines called |set_pos| and |cur_pos|.
The call |set_pos(f,n)| moves to item |n| in file |f|, unless |n| is
negative or larger than the total number of items in |f|; in the latter
case, |set_pos(f,n)| moves to the end of file |f|.
The call |cur_pos(f)| gives the total number of items in |f|, if
|eof(f)| is true; we use |cur_pos| only in such a situation.
@p function dvi_length:integer;
begin set_pos(dvi_file,-1); dvi_length←cur_pos(dvi_file);
end;
@#
procedure move_to_byte(n:integer);
begin set_pos(dvi_file,n); cur_loc←n;
end;
@* Reading the font information.
\.{DVI} file format does not include information about character widths, since
that would tend to make the files a lot longer. But a program that reads
a \.{DVI} file is supposed to know the widths of the characters that appear
in \\{set\_char} commands. Therefore \.{DVItype} looks at the font metric
(\.{TFM}) files for the fonts that are involved.
@.TFM {\rm files}@>
The character-width data appears also in other files (e.g., in \.{PXL} files
that contain bit patterns for printing characters on low-resolution devices);
thus, it is usually possible for \.{DVI} reading programs to get by with
accessing only one file per font. \.{DVItype} has a comparatively easy
task in this regard, since it needs only a few words of information from
each font; other \.{DVI}-to-printer programs may have to go to some pains to
deal with complications that arise when a large number of large font files
all need to be accessed simultaneously.
@ For purposes of this program, we need to know only two things about a
given character |c| in a given font |f|: (1)@@Is |c| a legal character
in@@|f|? (2)@@If so, what is the width of |c|? We also need to know the
symbolic name of each font, so it can be printed out, and we need to know
the approximate size of inter-word spaces in each font.
The answers to these questions appear implicitly in the following data
structures. The current number of known fonts is |nf|. Each known font has
an internal number |f|, where |0≤f<nf|; the external number of this font,
i.e., its font identification number in the \.{DVI} file, is
|font_num[f]|, and the external name of this font is the string that
occupies positions |font_name[f]| through |font_name[f+1]-1| of the array
|names|. The latter array consists of |ascii_code| characters, and
|font_name[nf]| is its first unoccupied position. A horizontal motion
less than |font_space[f]| will be treated as a `kern' that is not
indicated in the printouts that \.{DVItype} produces between brackets. The
legal characters run from |font_bc[f]| to |font_ec[f]|, inclusive; more
precisely, a given character |c| is valid in font |f| if and only if
|font_bc[f]≤c≤font_ec[f]| and |char_width(f)(c)≠invalid_width|.
(Exception: If |font_ec[f]=256|, all characters |c≥256| are valid and have
the same width |char_width(f)(256)|.)
@↑oriental characters@>@↑Chinese characters@>@↑Japanese characters@>
Finally, |char_width(f)(c)=width[width_base[f]+c]|, and |width_ptr| is the
first unused position of the |width| array.
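The indexing scheme can be tried out with the following Python sketch. It is
only an illustration: the class |WidthTable| and its methods are
hypothetical, the lists grow dynamically instead of having fixed bounds, and
the exception for |font_ec[f]=256| is omitted. The offset stored in
|width_base| is chosen so that character |font_bc[f]| lands on the first
width belonging to font |f|.

```python
INVALID_WIDTH = 0o17777777777


class WidthTable:
    """Widths of all fonts packed into one list, addressed via width_base."""

    def __init__(self):
        self.width = []
        self.font_bc = []
        self.font_ec = []
        self.width_base = []

    def add_font(self, bc: int, widths) -> int:
        """Append widths for characters bc, bc+1, ...; return the internal font number."""
        f = len(self.font_bc)
        self.font_bc.append(bc)
        self.font_ec.append(bc + len(widths) - 1)
        self.width_base.append(len(self.width) - bc)  # may be negative
        self.width.extend(widths)
        return f

    def char_width(self, f: int, c: int) -> int:
        return self.width[self.width_base[f] + c]

    def is_valid(self, f: int, c: int) -> bool:
        in_range = self.font_bc[f] <= c <= self.font_ec[f]
        return in_range and self.char_width(f, c) != INVALID_WIDTH
```

The range test must come first, since |char_width| is meaningful only for
characters between |font_bc[f]| and |font_ec[f]|.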
@d char_width_end(#)==#]
@d char_width(#)==width[width_base[#]+char_width_end
@d invalid_width==@'17777777777
@<Glob...@>=
@!font_num:array [0..max_fonts] of integer; {external font numbers}
@!font_name:array [0..max_fonts] of 0..name_size; {starting positions
of external font names}
@!names:array [0..name_size] of ascii_code; {characters of names}
@!font_space:array [0..max_fonts] of integer; {boundary between ``small''
and ``large'' spaces}
@!font_bc:array [0..max_fonts] of integer; {beginning characters in fonts}
@!font_ec:array [0..max_fonts] of integer; {ending characters in fonts}
@!width_base:array [0..max_fonts] of 0..max_widths; {index into |width| table}
@!width:array [0..max_widths] of integer; {character widths, in \.{DVI} units}
@!nf:0..max_fonts; {the number of known fonts}
@!width_ptr:0..max_widths; {the number of known character widths}
@ @<Set init...@>=
nf←0; width_ptr←0; font_name[0]←0;
@ It is, of course, a simple matter to print the name of a given font.
@p procedure print_font(@!f:integer);
var k:0..name_size; {index into |names|}
begin if f=nf then print('undefined font!')
@.undefined font@>
else begin for k←font_name[f] to font_name[f+1]-1 do
print(xchr[names[k]]);
end;
end;
@ An auxiliary array |in_width| is used to hold the widths as they are
input. The global variable |tfm_check_sum| is set to the check sum that
appears in the current \.{TFM} file.
@<Glob...@>=
@!in_width:array[0..255] of integer; {\.{TFM} width data in \.{DVI} units}
@!tfm_check_sum:integer; {check sum found in |tfm_file|}
@ Here is a procedure that absorbs the necessary information from a
\.{TFM} file, assuming that the file has just been successfully reset
so that we are ready to read its first byte. (A complete description of
\.{TFM} file format appears in the documentation of \.{TFtoPL} and will
not be repeated here.) The procedure makes only a few consistency checks
on the \.{TFM} file, and it gives no explicit information about what is
wrong with a \.{TFM} file that proves to be invalid; \.{DVI}-reading
programs need not do this, since \.{TFM} files are almost always valid,
and since the \.{TFtoPL} utility program has been specifically designed
to diagnose \.{TFM} errors. The procedure simply returns |false| if it
detects anything amiss in the \.{TFM} data.
There is a parameter, |z|, which represents the scaling factor being
used to compute the font dimensions; it must be in the range $0<z<2↑{27}$.
@p function in_TFM(@!z:integer):boolean; {input \.{TFM} data or return |false|}
label 9997, {go here when the format is bad}
9998, {go here when the information cannot be loaded}
9999; {go here to exit}
var k:integer; {index for loops}
@!lh:integer; {length of the header data, in four-byte words}
@!nw:integer; {number of words in the width table}
@!wp:0..max_widths; {new value of |width_ptr| after successful input}
@!alpha,@!beta:integer; {quantities used in the scaling computation}
begin @<Read past the header data; |goto 9997| if there is a problem@>;
@<Store character-width indices at the end of the |width| table@>;
@<Read and convert the width values, setting up the |in_width| table@>;
@<Move the widths from |in_width| to |width|, and append |pixel_width| values@>;
width_ptr←wp; in_TFM←true; goto 9999;
9997: print_ln('---not loaded, TFM file is bad');
@.TFM file is bad@>
9998: in_TFM←false;
9999: end;
@ @<Read past the header...@>=
read_tfm_word; lh←b2*256+b3;
read_tfm_word; font_bc[nf]←b0*256+b1; font_ec[nf]←b2*256+b3;
if font_ec[nf]<font_bc[nf] then font_bc[nf]←font_ec[nf]+1;
if width_ptr+font_ec[nf]-font_bc[nf]+1>max_widths then
begin print_ln('---not loaded, DVItype needs larger width table');
@.DVItype needs larger...@>
goto 9998;
end;
wp←width_ptr+font_ec[nf]-font_bc[nf]+1;
read_tfm_word; nw←b0*256+b1;
if (nw=0)∨(nw>256) then goto 9997;
for k←1 to 3+lh do
begin if eof(tfm_file) then goto 9997;
read_tfm_word;
if k=4 then
if b0<128 then tfm_check_sum←((b0*256+b1)*256+b2)*256+b3
else tfm_check_sum←(((b0-256)*256+b1)*256+b2)*256+b3;
end;
@ @<Store character-width indices...@>=
if wp>0 then for k←width_ptr to wp-1 do
begin read_tfm_word;
if b0>nw then goto 9997;
width[k]←b0;
end;
@ The most important part of |in_TFM| is the width computation, which
involves multiplying the relative widths in the \.{TFM} file by the
scaling factor in the \.{DVI} file. This fixed-point multiplication
must be done with precisely the same accuracy by all \.{DVI}-reading programs,
in order to validate the assumptions made by \.{DVI}-writing programs
like \TeX82.
Let us therefore summarize what needs to be done. Each width in a \.{TFM}
file appears as a four-byte quantity called a |fix_word|. A |fix_word|
whose respective bytes are $(a,b,c,d)$ represents the number
$$x=\left\{\vcenter{\halign{\lft{$#$,}\qquad&if \lft{$#$}\cr
b\cdot2↑{-4}+c\cdot2↑{-12}+d\cdot2↑{-20}&a=0;\cr
-16+b\cdot2↑{-4}+c\cdot2↑{-12}+d\cdot2↑{-20}&a=255.\cr}}\right.$$
(No other choices of $a$ are allowed, since the magnitude of a \.{TFM}
dimension must be less than 16.) We want to multiply this quantity by the
integer@@|z|, which is known to be less than $2↑{27}$. Let $\alpha=16z$.
If $|z|<2↑{23}$, the individual multiplications $b\cdot z$, $c\cdot z$,
$d\cdot z$ cannot overflow; otherwise we will divide |z| by 2, 4, 8, or
16, to obtain a multiplier less than $2↑{23}$, and we can compensate for
this later. If |z| has thereby been replaced by $|z|↑\prime=|z|/2↑e$, let
$\beta=2↑{4-e}$; we shall compute
$$\lfloor(b+c\cdot2↑{-8}+d\cdot2↑{-16})\,z↑\prime/\beta\rfloor$$ if $a=0$,
or the same quantity minus $\alpha$ if $a=255$. This calculation must be
done exactly, for the reasons stated above; the following program does the
job in a system-independent way, assuming that arithmetic is exact on
numbers less than $2↑{31}$ in magnitude.
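The computation just summarized can be checked on small examples with the
following Python sketch. It is offered as an illustration, not as the
program's own code; the function name |scale_fix_word| is hypothetical, and
the bytes $(a,b,c,d)$ are passed as separate arguments.

```python
def scale_fix_word(a: int, b: int, c: int, d: int, z: int) -> int:
    """Multiply the fix_word with bytes (a, b, c, d) by the scale factor z."""
    if not 0 < z < 2 ** 27:
        raise ValueError("z must satisfy 0 < z < 2^27")
    alpha = 16 * z
    beta = 16
    while z >= 0o40000000:  # reduce z below 2^23 so that byte*z cannot overflow
        z //= 2
        beta //= 2
    # Every operand here is nonnegative, so // agrees with Pascal's div.
    width = (((d * z) // 0o400 + c * z) // 0o400 + b * z) // beta
    if a == 255:
        width -= alpha
    elif a != 0:
        raise ValueError("first byte of a fix_word must be 0 or 255")
    return width
```

For instance, the |fix_word| for 1.0 has bytes $(0,16,0,0)$, so scaling it by
|z| returns |z| itself; the |fix_word| for $-1.0$ has bytes $(255,240,0,0)$
and yields $-|z|$.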
@<Read and convert the width values...@>=
@<Replace |z| by $|z|↑\prime$ and compute $\alpha,\beta$@>;
for k←0 to nw-1 do
begin read_tfm_word;
in_width[k]←(((((b3*z)div@'400)+(b2*z))div@'400)+(b1*z))div beta;
if b0>0 then if b0<255 then goto 9997
else in_width[k]←in_width[k]-alpha;
end
@ @<Replace |z|...@>=
begin alpha←16*z; beta←16;
while z≥@'40000000 do
begin z←z div 2; beta←beta div 2;
end;
end
@ A \.{DVI}-reading program usually works with font files instead of
\.{TFM} files, so \.{DVItype} is atypical in that respect. Font files
should, however, contain exactly the same character width data that is
found in the corresponding \.{TFM}s. In addition, font files usually
also contain the widths of characters in pixels, since the device-independent
character widths of \.{TFM} files are generally not perfect multiples of
pixels.
The |pixel_width| array contains this information; when |width[k]| is the
device-independent width of some character in \.{DVI} units, |pixel_width[k]|
is the corresponding width of that character in an actual font.
The macro |char_pixel_width| is set up to be analogous to |char_width|.
@d char_pixel_width(#)==pixel_width[width_base[#]+char_width_end
@<Glob...@>=
@!pixel_width:array[0..max_widths] of integer; {actual character widths,
in pixels}
@!conv:real; {converts \.{DVI} units to pixels}
@!true_conv:real; {converts unmagnified \.{DVI} units to pixels}
@ The following code computes pixel widths by simply rounding the \.{TFM}
widths to the nearest integer number of pixels, based on the conversion factor
|conv| that converts \.{DVI} units to pixels. However, such a simple
formula will not be valid for all fonts, and it will often give results that
are off by |@t$\pm1$@>| when a low-resolution font has been carefully
hand-fitted. For example, a font designer often wants to make the letter `m'
a pixel wider or narrower in order to make the font appear more consistent.
\.{DVI}-to-printer programs should therefore input the correct pixel width
information from font files whenever there is a chance that it may differ.
A warning message may also be desirable in the case that at least one character
is found whose pixel width differs from |conv*width| by more than a full pixel.
@d pixel_round(#)==trunc(conv*(#)+0.5)
@<Move the widths from |in_width| to |width|, and append |pixel_width| values@>=
width_base[nf]←width_ptr-font_bc[nf];
if wp>0 then for k←width_ptr to wp-1 do
begin width[k]←in_width[width[k]];
pixel_width[k]←pixel_round(width[k]);
end
@* Optional modes of output.
\.{DVItype} will print different quantities of information based on some
options that the user can specify: The |out_mode| level is set to one of
three values (|errors_only|, |terse|, |verbose|), giving different degrees
of output; and the typeout can be confined to a restricted subset of the
pages by specifying the desired starting page and the maximum number
of pages. Furthermore there is an option to specify the resolution of an
assumed discrete output device, so that pixel-oriented calculations will
be shown; and there is an option to override the magnification factor
that is stated in the \.{DVI} file.
The starting page is specified by giving a sequence of 1 to 10 numbers or
asterisks separated by dots. For example, the specification `\.{1.*.-5}'
can be used to refer to a page output by \TeX\ when $\.{\\count0}=1$
and $\.{\\count2}=-5$. (Recall that |bop| commands in a \.{DVI} file
are followed by ten `count' values.) An asterisk matches any number,
so the `\.*' in `\.{1.*.-5}' means that \.{\\count1} is ignored when
specifying the first page. If several pages match the given specification,
\.{DVItype} will begin with the earliest such page in the file. The
default specification `\.*' (which matches all pages) therefore denotes
the page at the beginning of the file.
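The matching rule can be sketched in Python as follows. This is a
hypothetical illustration (the names |parse_start_spec|, |start_match|, and
|first_matching_page| are not taken from the program's dialog code), with an
asterisk represented by |None| and each page by its list of ten count values.

```python
def parse_start_spec(spec: str):
    """Turn a specification such as '1.*.-5' into [1, None, -5]."""
    fields = spec.split(".")
    if not 1 <= len(fields) <= 10:
        raise ValueError("a specification has 1 to 10 fields")
    return [None if field == "*" else int(field) for field in fields]


def start_match(pattern, counts) -> bool:
    """A page matches when every non-asterisk field equals the corresponding count."""
    return all(want is None or want == have for want, have in zip(pattern, counts))


def first_matching_page(pattern, pages):
    """Return the index of the earliest matching page, or None if there is none."""
    for index, counts in enumerate(pages):
        if start_match(pattern, counts):
            return index
    return None
```

Counts beyond the last field of the specification are never examined, which
is why the default specification matches the first page of any file.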
When \.{DVItype} begins, it engages the user in a brief dialog so that the
options can be specified. This part of \.{DVItype} requires nonstandard
\PASCAL\ constructions to handle the online interaction; so it may be
preferable in some cases to omit the dialog and simply to stick to the
default options (|out_mode=verbose|, starting page `\.*',
|max_pages=1000000|, |resolution=240.0|, |new_mag=0|). On the other hand, the
system-dependent routines that are needed are not complicated, so it will
not be terribly difficult to introduce them.
@↑system dependencies@>
@d errors_only=0 {value of |out_mode| when minimal printing occurs}
@d terse=1 {value of |out_mode| for abbreviated output}
@d verbose=2 {value of |out_mode| when you want the works}
@<Glob...@>=
@!out_mode:errors_only..verbose; {controls the amount of output}
@!max_pages:integer; {at most this many |bop..eop| pages will be printed}
@!resolution:real; {pixels per inch}
@!new_mag:integer; {if positive, overrides the postamble's magnification}
@ The starting page specification is recorded in two arrays called
|start_count| and |start_there|. For example, `\.{1.*.-5}' is represented
by |start_there[0]=true|, |start_count[0]=1|, |start_there[1]=false|,
|start_there[2]=true|, |start_count[2]=-5|.
We also set |start_vals=2|, to indicate that count 2 was the last one
mentioned. The other values of |start_count| and |start_there| are not
important.
@<Glob...@>=
@!start_count:array[0..9] of integer; {count values to select starting page}
@!start_there:array[0..9] of boolean; {is the |start_count| value relevant?}
@!start_vals:0..9; {the last count considered significant}
@!count:array[0..9] of integer; {the count values on the current page}
@ @<Set init...@>=
out_mode←verbose; max_pages←1000000; start_vals←0; start_there[0]←false;
@ Here is a simple subroutine that tests if the current page might be the
starting page.
@p function start_match:boolean; {does |count| match the starting spec?}
var k:0..9; {loop index}
@!match:boolean; {does everything match so far?}
begin match←true;
for k←0 to start_vals do
if start_there[k]∧(start_count[k]≠count[k]) then match←false;
start_match←match;
end;
@ The |input_ln| routine waits for the user to type a line at his or her
terminal; then it puts ascii-code equivalents for the characters on that line
into the |buffer| array. The |term_in| file is used for terminal input,
and |term_out| for terminal output.
@↑system dependencies@>
@<Glob...@>=
@!buffer:array[0..terminal_line_length] of ascii_code;
@!term_in:text_file; {the terminal, considered as an input file}
@!term_out:text_file; {the terminal, considered as an output file}
@ Since the terminal is being used for both input and output, some systems
need a special routine to make sure that the user can see a prompt message
before waiting for input based on that message. (Otherwise the message
may just be sitting in a hidden buffer somewhere, and the user will have
no idea what the program is waiting for.) We shall call a system-dependent
subroutine |update_terminal| in order to avoid this problem.
@↑system dependencies@>
@d update_terminal == break(term_out) {empty the terminal output buffer}
@ During the dialog, \.{DVItype} will treat the first blank space in a
line as the end of that line. Therefore |input_ln| makes sure that there
is always at least one blank space in |buffer|.
@↑system dependencies@>
@p procedure input_ln; {inputs a line from the terminal}
var k:0..terminal_line_length;
begin update_terminal; reset(term_in);
if eoln(term_in) then read_ln(term_in);
k←0;
while (k<terminal_line_length)∧ not eoln(term_in) do
begin buffer[k]←xord[term_in↑]; incr(k); get(term_in);
end;
buffer[k]←" ";
end;
@ The global variable |buf_ptr| is used while scanning each line of input;
it points to the first unread character in |buffer|.
@<Glob...@>=
@!buf_ptr:0..terminal_line_length; {the number of characters read}
@ Here is a routine that scans a (possibly signed) integer and computes
the decimal value. If no decimal integer starts at |buf_ptr|, the
value 0 is returned. The integer should be less than $2↑{31}$ in
absolute value.
@p function get_integer:integer;
var x:integer; {accumulates the value}
@!negative:boolean; {should the value be negated?}
begin if buffer[buf_ptr]="-" then
begin negative←true; incr(buf_ptr);
end
else negative←false;
x←0;
while (buffer[buf_ptr]≥"0")∧(buffer[buf_ptr]≤"9") do
begin x←10*x+buffer[buf_ptr]-"0"; incr(buf_ptr);
end;
if negative then get_integer←-x @+ else get_integer←x;
end;
@ The selected options are put into global variables by the |dialog|
procedure, which is called just as \.{DVItype} begins.
@↑system dependencies@>
@p procedure dialog;
label 1,2,3,4,5;
var k:integer; {loop variable}
begin rewrite(term_out); {prepare the terminal for output}
write_ln(term_out,banner);
@<Determine the desired |out_mode|@>;
@<Determine the desired |start_count| values@>;
@<Determine the desired |max_pages|@>;
@<Determine the desired |resolution|@>;
@<Determine the desired |new_mag|@>;
@<Print all the selected options@>;
end;
@ @<Determine the desired |out_mode|@>=
1: write(term_out,'Output level (default=2, ? for help): ');
out_mode←2; input_ln;
if buffer[0]≠" " then
if (buffer[0]≥"0")∧(buffer[0]≤"2") then out_mode←buffer[0]-"0"
else begin write(term_out,'Type 2 for complete listing,');
write(term_out,' 0 for errors only,');
write_ln(term_out,' 1 for something in between.');
goto 1;
end
@ @<Determine the desired |start...@>=
2: write(term_out,'Starting page (default=*): ');
start_vals←0; start_there[0]←false;
input_ln; buf_ptr←0; k←0;
if buffer[0]≠" " then
repeat if buffer[buf_ptr]="*" then
begin start_there[k]←false; incr(buf_ptr);
end
else begin start_there[k]←true; start_count[k]←get_integer;
end;
if (k<9)∧(buffer[buf_ptr]=".") then
begin incr(k); incr(buf_ptr);
end
else if buffer[buf_ptr]=" " then start_vals←k
else begin write(term_out,'Type, e.g., 1.*.-5 to specify the ');
write_ln(term_out,'first page with \count0=1, \count2=-5.');
goto 2;
end;
until start_vals=k
@ @<Determine the desired |max_pages|@>=
3: write(term_out,'Maximum number of pages (default=1000000): ');
max_pages←1000000; input_ln; buf_ptr←0;
if buffer[0]≠" " then
begin max_pages←get_integer;
if max_pages≤0 then
begin write_ln(term_out,'Please type a positive number.');
goto 3;
end;
end
@ @<Determine the desired |resolution|@>=
4: write(term_out,'Assumed device resolution');
write(term_out,' in pixels per inch (default=240/1): ');
resolution←240.0; input_ln; buf_ptr←0;
if buffer[0]≠" " then
begin k←get_integer;
if (k>0)∧(buffer[buf_ptr]="/")∧
(buffer[buf_ptr+1]>"0")∧(buffer[buf_ptr+1]≤"9") then
begin incr(buf_ptr); resolution←k/get_integer;
end
else begin write(term_out,'Type a ratio of positive integers;');
write_ln(term_out,' (1 pixel per mm would be 254/10).');
goto 4;
end;
end
@ @<Determine the desired |new_mag|@>=
5: write(term_out,'New magnification (default=0 to keep the old one): ');
new_mag←0; input_ln; buf_ptr←0;
if buffer[0]≠" " then
if (buffer[0]≥"0")∧(buffer[0]≤"9") then new_mag←get_integer
else begin write(term_out,'Type a positive integer to override ');
write_ln(term_out,'the magnification in the DVI file.');
goto 5;
end
@ After the dialog is over, we print the options so that the user
can see what \.{DVItype} thought was specified.
@<Print all the selected options@>=
print_ln('Options selected:');
print(' Starting page = ');
for k←0 to start_vals do
begin if start_there[k] then print(start_count[k]:0)
else print('*');
if k<start_vals then print('.')
else print_ln(' ');
end;
print_ln(' Maximum number of pages = ',max_pages:0);
print(' Output level = ',out_mode:0);
case out_mode of
errors_only: print_ln(' (showing bops and error messages only)');
terse: print_ln(' (terse)');
verbose: print_ln(' (verbose)');
end;@/
print_ln(' Resolution = ',resolution:12:8,' pixels per inch');
if new_mag>0 then print_ln(' New magnification factor = ',new_mag/1000:8:3)
@* Low level output routines.
Simple text in the \.{DVI} file is saved in a buffer until |line_length-2|
characters have accumulated, or until some non-simple \.{DVI} operation
occurs. Then the accumulated text is printed on a line, surrounded by
brackets. The global variable |text_ptr| keeps track of the number of
characters currently in the buffer.
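The buffering discipline can be imitated in a few lines of Python. The
sketch below is an illustration only: the class |TextBuffer| is
hypothetical, the default value of |line_length| is an assumption (the
actual constant is defined elsewhere in the program), and output lines are
collected in a list instead of being printed.

```python
class TextBuffer:
    """Accumulates simple text and emits it between brackets."""

    def __init__(self, line_length: int = 79, show: bool = True):
        self.line_length = line_length  # assumed value; see the lead-in
        self.show = show                # plays the role of out_mode > errors_only
        self.chars = []
        self.lines = []

    def flush_text(self) -> None:
        """Empty the buffer if there is something in it."""
        if self.chars:
            if self.show:
                self.lines.append("[" + "".join(self.chars) + "]")
            self.chars = []

    def out_text(self, c: str) -> None:
        """Save one character, flushing first when line_length-2 are waiting."""
        if len(self.chars) == self.line_length - 2:
            self.flush_text()
        self.chars.append(c)
```

Flushing at |line_length-2| characters leaves room for the two brackets, so
that no output line exceeds |line_length|.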
@<Glob...@>=
@!text_ptr:0..line_length; {the number of characters in |text_buf|}
@!text_buf:array[1..line_length] of ascii_code; {saved characters}
@ @<Set init...@>=
text_ptr←0;
@ The |flush_text| procedure will empty the buffer if there is something in it.
@p procedure flush_text;
var k:0..line_length; {index into |text_buf|}
begin if text_ptr>0 then
begin if out_mode>errors_only then
begin print('[');
for k←1 to text_ptr do print(xchr[text_buf[k]]);
print_ln(']');
end;
text_ptr←0;
end;
end;
@ And the |out_text| procedure puts something in it.
@p procedure out_text(c:ascii_code);
begin if text_ptr=line_length-2 then flush_text;
incr(text_ptr); text_buf[text_ptr]←c;
end;
@* Translation to symbolic form.
The main work of \.{DVItype} is accomplished by the |do_page| procedure,
which produces the output for an entire page, assuming that the |bop|
command for that page has already been processed. This procedure is
essentially an interpretive routine that reads and acts on the \.{DVI}
commands.
@ The definition of \.{DVI} files refers to six registers,
$(h,v,w,x,y,z)$, which hold integer values in \.{DVI} units. In practice,
we also need registers |hh| and |vv|, the pixel analogs of $h$ and $v$,
since it is not always true that |hh=pixel_round(h)| or
|vv=pixel_round(v)|.
The stack of $(h,v,w,x,y,z)$ values is represented by eight arrays
called |hstack|, $\ldotss$, |zstack|, |hhstack|, and |vvstack|.
@<Glob...@>=
@!h,@!v,@!w,@!x,@!y,@!z,@!hh,@!vv:integer; {current state values}
@!hstack,@!vstack,@!wstack,@!xstack,@!ystack,@!zstack:
array [0..stack_size] of integer; {pushed down values in \.{DVI} units}
@!hhstack,@!vvstack:
array [0..stack_size] of integer; {pushed down values in pixels}
@ Three characteristics of the pages (their |max_v|, |max_h|, and
|max_stack_depth|) are specified in the postamble, and a warning message
is printed if these limits are exceeded. Actually |max_v| is set to
the maximum height plus depth of a page, and |max_h| to the maximum width,
for purposes of page layout. Since characters can legally be set outside
of the page boundaries, it is not an error when |max_v| or |max_h| is
exceeded. But |max_stack_depth| should not be exceeded.
@<Glob...@>=
@!max_v:integer; {the value of |abs(v)| should probably not exceed this}
@!max_h:integer; {the value of |abs(h)| should probably not exceed this}
@!max_stack_depth:integer; {the stack depth should not exceed this}
@ Before we get into the details of |do_page|, it is convenient to
consider a simpler routine that computes the first parameter of each
opcode.
@d four_cases(#)==#,#+1,#+2,#+3
@d eight_cases(#)==four_cases(#),four_cases(#+4)
@d sixteen_cases(#)==eight_cases(#),eight_cases(#+8)
@d thirty_two_cases(#)==sixteen_cases(#),sixteen_cases(#+16)
@d sixty_four_cases(#)==thirty_two_cases(#),thirty_two_cases(#+32)
@p function first_par(o:eight_bits):integer;
begin case o of
sixty_four_cases(set_char_0),sixty_four_cases(set_char_0+64):
first_par←o-set_char_0;
set1,put1,fnt1,xxx1: first_par←get_byte;
set1+1,put1+1,fnt1+1,xxx1+1: first_par←get_two_bytes;
set1+2,put1+2,fnt1+2,xxx1+2: first_par←get_three_bytes;
right1,w1,x1,down1,y1,z1: first_par←signed_byte;
right1+1,w1+1,x1+1,down1+1,y1+1,z1+1: first_par←signed_pair;
right1+2,w1+2,x1+2,down1+2,y1+2,z1+2: first_par←signed_trio;
set1+3,set_rule,put1+3,put_rule,right1+3,w1+3,x1+3,down1+3,y1+3,z1+3,
fnt1+3,xxx1+3: first_par←signed_quad;
nop,bop,eop,push,pop,pst,undefined_commands: first_par←0;
w0: first_par←w;
x0: first_par←x;
y0: first_par←y;
z0: first_par←z;
sixty_four_cases(fnt_num_0): first_par←o-fnt_num_0;
end;
end;
@ Here is another subroutine that we need: It computes the number of
pixels in the height or width of a rule. Characters and rules will line up
properly if the sizes are computed precisely as specified here. (Since
|conv| is computed with some floating-point roundoff error, in a
machine-dependent way, format designers who are tailoring something for a
particular resolution should not plan their measurements to come out to an
exact integer number of pixels; they should compute things so that the
rule dimensions are a little less than an integer number of pixels, e.g.,
4.99 instead of 5.00.)
@p function rule_pixels(x:integer):integer;
{computes $\lceil|conv|\cdot x\rceil$}
var n:integer;
begin n←trunc(conv*x);
if n<conv*x then rule_pixels←n+1 @+ else rule_pixels←n;
end;
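@ The ceiling computation in |rule_pixels| is easy to check by hand. Here is
a small sketch in Python, not part of \.{DVItype}; the value of |conv| is a
hypothetical one chosen so that the floating-point products are exact.

```python
import math


def rule_pixels(conv, x):
    """Smallest integer n with n >= conv*x, computed as in the text."""
    n = math.trunc(conv * x)
    return n + 1 if n < conv * x else n


conv = 0.25  # hypothetical: four DVI units per pixel
print(rule_pixels(conv, 20))  # exactly 5.0 pixels, so 5
print(rule_pixels(conv, 21))  # 5.25 pixels, rounded up to 6
```

The second call shows why a dimension that lands even slightly above an
integer number of pixels costs a whole extra pixel.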
@ Strictly speaking, the |do_page| procedure is really a function with
side effects, not a `\&{procedure}'; it returns the value |false| if
\.{DVItype} should be aborted because of some unusual happening. The
subroutine is organized as a typical interpreter, with a multiway branch
on the command code followed by |goto| statements leading to routines that
finish up the activities common to different commands. We will use the
following labels:
@d fin_set=41 {label for commands that set or put a character}
@d fin_rule=42 {label for commands that set or put a rule}
@d move_right=43 {label for commands that change |h|}
@d move_down=44 {label for commands that change |v|}
@d show_state=45 {label for commands that change |s|}
@d change_font=46 {label for commands that change |cur_font|}
@d done=30 {label for the end of a command}
@ Some \PASCAL\ compilers severely restrict the length of procedure bodies,
so we shall split |do_page| into two parts, one of which is
called |special_cases|. The different parts communicate with each other
via the following global variables.
@<Glob...@>=
@!s:integer; {current stack size}
@!ss:integer; {stack size to print}
@!cur_font:integer; {current internal font number}
@!showing:boolean; {is the current command being translated in full?}
@ Here is the overall setup.
@p @<Declare the function called |special_cases|@>@;
function do_page:boolean;
label fin_set,fin_rule,move_right,show_state,done,9998,9999;
var o:eight_bits; {operation code of the current command}
@!p,@!q:integer; {parameters of the current command}
@!a:integer; {byte number of the current command}
begin cur_font←nf; {set current font undefined}
s←0; h←0; v←0; w←0; x←0; y←0; z←0; hh←0; vv←0;
{initialize the state variables}
while true do @<Translate the next command in the \.{DVI} file;
|goto 9999| with |do_page=true| if it was |eop|;
|goto 9998| if premature termination is needed@>;
9998: print_ln('!'); do_page←false;
9999: end;
@ Commands are broken down into ``major'' and ``minor'' categories:
A major command is always shown in full, while a minor one is
put into the buffer in abbreviated form. Minor commands, which
account for the bulk of most \.{DVI} files, involve horizontal spacing
and the typesetting of characters in a line; these are shown in full
only if |out_mode=verbose|.
@d show(#)==begin flush_text; showing←true; print(a:0,': ',#);
end
@d major(#)==if out_mode>errors_only then show(#)
@d minor(#)==if out_mode=verbose then
begin showing←true; print(a:0,': ',#);
end
@d error(#)==if not showing then show(#) else print(' ',#)
@<Translate the next command...@>=
begin a←cur_loc; showing←false;
o←get_byte; p←first_par(o);
@<Start translation of command |o| and |goto| the appropriate label to
finish the job@>;
fin_set: @<Finish a command that either sets or puts a character, then
|goto move_right| or |done|@>;
fin_rule: @<Finish a command that either sets or puts a rule, then
|goto move_right| or |done|@>;
move_right: @<Finish a command that sets |h←h+q|, then |goto done|@>;
show_state: @<Show the values of |ss|, |h|, |v|, |w|, |x|, |y|, |z|,
|hh|, and |vv|; then |goto done|@>;
done: if showing then print_ln(' ');
end
@ The multiway switch in |first_par|, above, was organized by the length
of each command; the one in |do_page| is organized by the semantics.
@<Start translation...@>=
if o<set_char_0+128 then @<Translate a |set_char| command@>
else case o of
four_cases(set1): begin major('set',o-set1+1:0,' ',p:0); goto fin_set;
end;
set_rule: begin major('setrule'); goto fin_rule;
end;
put_rule: begin major('putrule'); goto fin_rule;
end;
@t\4@>@<Cases for |nop|, |bop|, $\ldotss$, |pop|@>@;
@t\4@>@<Cases for horizontal motion@>@;
othercases if special_cases(o,p,a) then goto done@+else goto 9998
endcases
@ @<Declare the function called |special_cases|@>=
function special_cases(@!o:eight_bits;@!p,@!a:integer):boolean;
label change_font,move_down,done,9998;
var q:integer; {parameter of the current command}
@!k:integer; {loop index}
@!bad_char:boolean; {has a non-ascii character code appeared in this |xxx|?}
@!pure:boolean; {is the command error-free?}
begin pure←true;
case o of
four_cases(put1): begin major('put',o-put1+1:0,' ',p:0); goto done;
end;
@t\4@>@<Cases for vertical motion@>@;
sixty_four_cases(fnt_num_0): begin major('fntnum',p:0);
goto change_font;
end;
four_cases(fnt1): begin major('fnt',o-fnt1+1:0,' ',p:0);
goto change_font;
end;
four_cases(xxx1): @<Translate an |xxx| command and |goto done|@>;
pst: begin error('pst occurred before eop'); goto 9998;
@.pst occurred before eop@>
end;
othercases begin error('undefined command ',o:0,'!');
goto done;
@.undefined command@>
end
endcases;
move_down: @<Finish a command that sets |v←v+p|, then |goto done|@>;
change_font: @<Finish a command that changes the current font,
then |goto done|@>;
9998: pure←false;
done: special_cases←pure;
end;
@ @<Cases for |nop|, |bop|, $\ldotss$, |pop|@>=
nop: begin minor('nop'); goto done;
end;
bop: begin error('bop occurred before eop'); goto 9998;
@.bop occurred before eop@>
end;
eop: begin major('eop');
if s≠0 then error('stack not empty at end of page (level ',
s:0,')!');
@.stack not empty...@>
do_page←true; goto 9999;
end;
push: begin major('push');
if s=max_stack_depth then error('deeper than claimed in postamble!');
@.deeper than claimed...@>
@.push deeper than claimed...@>
if s=stack_size then
begin error('DVItype capacity exceeded (stack size=',
stack_size:0,')'); goto 9998;
end;
hstack[s]←h; vstack[s]←v; wstack[s]←w;
xstack[s]←x; ystack[s]←y; zstack[s]←z;
hhstack[s]←hh; vvstack[s]←vv; incr(s); ss←s-1; goto show_state;
end;
pop: begin major('pop');
if s=0 then error('(illegal at level zero)!')
else begin decr(s); hh←hhstack[s]; vv←vvstack[s];
h←hstack[s]; v←vstack[s]; w←wstack[s];
x←xstack[s]; y←ystack[s]; z←zstack[s];
end;
ss←s; goto show_state;
end;
@ Rounding to the nearest pixel is best done in the manner shown here, so as
to be inoffensive to the eye: When the horizontal motion is small, like a
kern, |hh| changes by rounding the kern; but when the motion is large, |hh|
changes by rounding the true position |h| so that accumulated rounding errors
disappear.
@d out_space(#)==if abs(p)≥font_space[cur_font] then
begin out_text(" "); hh←pixel_round(h+p);
end
else hh←hh+pixel_round(p);
minor(#,' ',p:0); q←p; goto move_right
@<Cases for horizontal motion@>=
four_cases(right1):begin out_space('right',o-right1+1:0);
end;
w0,four_cases(w1):begin w←p; out_space('w',o-w0:0);
end;
x0,four_cases(x1):begin x←p; out_space('x',o-x0:0);
end;
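@ The effect of the two rounding rules for horizontal motion can be seen in a
small experiment. This is a sketch in Python, not part of \.{DVItype}. It
assumes that |pixel_round| rounds |conv| times its argument to the nearest
integer (the real definition lies elsewhere in the program), and the values of
|conv| and of the font space threshold are hypothetical.

```python
import math


def pixel_round(conv, x):
    # Assumption: nearest-pixel rounding of conv*x.
    return math.floor(conv * x + 0.5)


def move_right(state, p, conv, font_space):
    """Update the pair (h, hh) for a horizontal motion of p DVI units."""
    h, hh = state
    if abs(p) >= font_space:
        hh = pixel_round(conv, h + p)   # large move: resynchronize with h
    else:
        hh = hh + pixel_round(conv, p)  # small move: round the kern itself
    return (h + p, hh)


conv = 0.25       # hypothetical pixels per DVI unit
font_space = 10   # hypothetical threshold, in DVI units
state = (0, 0)
for _ in range(5):
    state = move_right(state, 5, conv, font_space)  # five small kerns
drifted = state
state = move_right(state, 20, conv, font_space)     # one large move
print(drifted, state)  # (25, 5) (45, 11)
```

After the five kerns |hh| is 5 although the true position is 6.25 pixels; the
large move then sets |hh| from |h| itself, so the accumulated error is gone.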
@ Vertical motion is done similarly, but with the threshold between
``small'' and ``large'' increased by a factor of five. The idea is to make
fractions like ``$1\over2$'' round consistently, but to absorb accumulated
rounding errors in the baseline-skip moves.
@d out_vmove(#)==if abs(p)≥5*font_space[cur_font] then vv←pixel_round(v+p)
else vv←vv+pixel_round(p);
major(#,' ',p:0); goto move_down
@<Cases for vertical motion@>=
four_cases(down1):begin out_vmove('down',o-down1+1:0);
end;
y0,four_cases(y1):begin y←p; out_vmove('y',o-y0:0);
end;
z0,four_cases(z1):begin z←p; out_vmove('z',o-z0:0);
end;
@ @<Translate an |xxx| command and |goto done|@>=
begin major('xxx'''); bad_char←false;
for k←1 to p do
begin q←get_byte;
if (q≥"!")∧(q≤"~") then
begin if showing then print(xchr[q]);
end
else bad_char←true;
end;
if showing then print('''');
if bad_char then error('non-ascii character in xxx command!');
@.non-ascii character...@>
goto done;
end
@ @<Translate a |set_char|...@>=
begin if (o>" ")∧(o≤"~") then
begin out_text(p); minor('setchar',p:0);
end
else major('setchar',p:0);
goto fin_set;
end
@ @<Finish a command that either sets or puts a character...@>=
if font_ec[cur_font]=256 then p←256; {width computation for oriental fonts}
if (p<font_bc[cur_font])∨(p>font_ec[cur_font]) then q←invalid_width
else q←char_width(cur_font)(p);
if q=invalid_width then
begin error('character ',p:0,' invalid in font ');
@.character $c$ invalid...@>
print_font(cur_font);
if cur_font≠nf then print('!');
end;
if o≥put1 then goto done;
if q=invalid_width then q←0
else hh←hh+char_pixel_width(cur_font)(p);
goto move_right
@ @<Finish a command that either sets or puts a rule...@>=
q←signed_quad;
if showing then
begin print(' height ',p:0,', width ',q:0);
if (p≤0)∨(q≤0) then print(' (invisible)')
else print(' (',rule_pixels(p):0,'x',rule_pixels(q):0,' pixels)');
end;
if o=put_rule then goto done;
if showing then print_ln(' ');
hh←hh+rule_pixels(q); goto move_right
@ Since \.{DVItype} is intended to diagnose strange errors, it checks
carefully to make sure that |h| and |v| do not get out of range.
Normal \.{DVI}-reading programs need not do this.
@d infinity==@'17777777777 {$\infty$ (approximately)}
@<Finish a command that sets |h←h+q|, then |goto done|@>=
if (h>0)∧(q>0) then if h>infinity-q then
begin error('arithmetic overflow! parameter changed from ',
@.arithmetic overflow...@>
q:0,' to ',infinity-h:0);
q←infinity-h;
end;
if (h<0)∧(q<0) then if -h>q+infinity then
begin error('arithmetic overflow! parameter changed from ',
q:0, ' to ',(-h)-infinity:0);
q←(-h)-infinity;
end;
if showing then
begin print(' h:=',h:0);
if q≥0 then print('+');
print(q:0,'=',h+q:0,', hh:=',hh:0);
end;
h←h+q;
if abs(h)>max_h then
begin error('warning: |h|>',max_h:0,'!');
@.warning:...@>
max_h←abs(h);
end;
goto done
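@ The overflow test above is arranged so that no intermediate quantity ever
leaves the range of |-infinity| to |infinity|. The same arithmetic can be
sketched in Python, where integers do not overflow and the result is therefore
easy to verify; the function name is hypothetical and the sketch is not part
of \.{DVItype}.

```python
INFINITY = 0o17777777777  # 2**31 - 1, the octal constant defined above


def clamp_increment(h, q):
    """Return q adjusted so that h+q stays within -INFINITY..INFINITY."""
    if h > 0 and q > 0 and h > INFINITY - q:
        q = INFINITY - h
    if h < 0 and q < 0 and -h > q + INFINITY:
        q = (-h) - INFINITY
    return q


print(clamp_increment(100, 200))               # unchanged: 200
print(clamp_increment(INFINITY - 5, 10))       # reduced to 5
print(clamp_increment(-(INFINITY - 5), -10))   # reduced to -5
```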
@ @<Finish a command that sets |v←v+p|, then |goto done|@>=
if (v>0)∧(p>0) then if v>infinity-p then
begin error('arithmetic overflow! parameter changed from ',
@.arithmetic overflow...@>
p:0,' to ',infinity-v:0);
p←infinity-v;
end;
if (v<0)∧(p<0) then if -v>p+infinity then
begin error('arithmetic overflow! parameter changed from ',
p:0, ' to ',(-v)-infinity:0);
p←(-v)-infinity;
end;
if showing then
begin print(' v:=',v:0);
if p≥0 then print('+');
print(p:0,'=',v+p:0,', vv:=',vv:0);
end;
v←v+p;
if abs(v)>max_v then
begin error('warning: |v|>',max_v:0,'!');
@.warning:...@>
max_v←abs(v);
end;
goto done
@ @<Show the values of |ss|, |h|, |v|, |w|, |x|, |y|, |z|...@>=
if showing then
begin print_ln(' ');
print('level ',ss:0,':(h=',h:0,',v=',v:0,
',w=',w:0,',x=',x:0,',y=',y:0,',z=',z:0,
',hh=',hh:0,',vv=',vv:0,')');
end;
goto done
@ @<Finish a command that changes the current font...@>=
font_num[nf]←p; cur_font←0;
while font_num[cur_font]≠p do incr(cur_font);
if showing then
begin print(' current font is '); print_font(cur_font);
end;
goto done
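@ The font lookup above is a linear search with a sentinel: the number being
sought is first stored in position |nf|, so the loop needs no bounds check,
and a result equal to |nf| means that the font is undefined. Here is a sketch
of the idea in Python, not part of \.{DVItype}; the function name and the
sample font numbers are made up.

```python
def find_font(font_num, nf, p):
    """Return the internal index of external font number p, or nf if absent."""
    font_num[nf] = p          # plant the sentinel in the first unused slot
    cur_font = 0
    while font_num[cur_font] != p:
        cur_font += 1
    return cur_font


font_num = [7, 12, 3, None]   # three fonts defined; slot 3 is the spare one
nf = 3
print(find_font(font_num, nf, 12))  # 1, a defined font
print(find_font(font_num, nf, 99))  # 3, which equals nf: undefined
```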
@* Finding the postamble and the starting page.
\.{DVItype} makes a first pass over the given \.{DVI} file in order to
(a)@@count the total number of pages (|page_count|); (b)@@find the
starting byte of the postamble (|pst_loc|) and prepare to read it;
(c)@@find the starting byte of the |bop| command that corresponds to the
specified starting page (|start_loc|).
If |random_reading| is true, this can be done by looking at only a
small percentage of the total number of bytes in a typical \.{DVI}
file. Otherwise the program marches through the whole file.
@<Glob...@>=
@!page_count:integer; {the number of |bop| commands in the \.{DVI} file}
@!pst_loc:integer; {address of the |pst| byte}
@!start_loc:integer; {address of the |bop| byte where translation should start}
@ If the \.{DVI} file is badly malformed, the whole process is aborted and
\.{DVItype} gives up, after issuing an error message about the symptoms
that were noticed.
@d abort(#)==begin print(' ',#); goto final_end;
end
@d bad_dvi(#)==abort('Bad DVI file: ',#,'!')
@.Bad DVI file@>
@ Some integer variables are used to control this initialization.
@<Glob...@>=
@!d,@!k,@!m,@!n,@!p,@!q,@!r:integer; {pointers into the \.{DVI} file,
or miscellaneous registers for temporary use}
@ Here is the program that takes care of the first ``pass'':
@<Do a quick look at the file, moving to byte number |pst_loc+5|@>=
open_dvi_file; pst_loc←-1; start_loc←-1; page_count←0;
if random_reading then
begin @<Find |pst_loc|, working backwards from the end of the file@>;
@<Count the pages and find |start_loc|@>;
move_to_byte(pst_loc+5);
end
else begin @<Go through the file, counting pages and finding |start_loc|
and |pst_loc|@>;
@<Check the pointer to the previous |bop|@>;
end
@ @<Find |pst_loc|, working backwards from the end of the file@>=
n←dvi_length;
if n<42 then bad_dvi('only ',n:0,' bytes long');
m←n-4;
repeat if m=0 then bad_dvi('all 223s');
move_to_byte(m); k←get_byte; decr(m);
until k≠223;
if k≠id_byte then bad_dvi('ID byte is ',k:0);
move_to_byte(m-3); q←signed_quad;
if (q<0)∨(q>m-36) then bad_dvi('pst pointer ',q:0,' at byte ',m-3:0);
move_to_byte(q); k←get_byte;
if k≠pst then bad_dvi('byte ',q:0,' is not pst');
pst_loc←q
@ At this point in the program, the \.{DVI} file is positioned ready to
read byte |pst_loc+1|. Normal \.{DVI}-reading programs would now proceed
immediately to look at the postamble; and there would be no need to
count the pages or to search for |start_loc| if all pages of the file are
to be processed. Thus, the loop shown here is not typical; it is
rather specific to \.{DVItype}'s mission of checking the file carefully.
@<Count the pages and find |start_loc|@>=
repeat p←signed_quad;
{now |q| points to a |pst| or |bop| command; |p| is prev pointer}
if (p>q-46)∧(p≥0) then
bad_dvi('page link ',p:0,' after byte ',q:0);
if p≥0 then
begin q←p; move_to_byte(q); k←get_byte;
if k=bop then incr(page_count)
else bad_dvi('byte ',q:0,' is not bop');
for k←0 to 9 do count[k]←signed_quad;
if start_match then start_loc←q;
end
else if q>0 then
begin move_to_byte(0);
while cur_loc<q do
begin k←get_byte;
if k≠nop then bad_dvi('byte ',cur_loc-1:0,' is not nop');
end;
end;
until p<0
@ If the first pass is really a pass (|random_reading=false|), we do it
in slower motion:
@<Go through the file, counting pages and...@>=
repeat if eof(dvi_file) then k←0@+else k←get_byte;
until k≠nop;
if (k≠bop)∧(k≠pst) then bad_dvi('first non-nop byte is ',k:0);
p←-1;
while pst_loc<0 do
begin m←first_par(k);
if k=bop then @<Pass |bop| command@>
else if (k=set_rule)∨(k=put_rule) then m←signed_quad
else if k=pst then pst_loc←cur_loc-1
else if (k≥xxx1)∧(k<xxx1+4) then for k←1 to m do n←get_byte;
if eof(dvi_file) then bad_dvi('postamble unfindable');
if pst_loc<0 then k←get_byte;
end
@ @<Pass |bop| command@>=
begin incr(page_count);
for k←0 to 9 do count[k]←signed_quad;
if (start_loc<0)∧ start_match then start_loc←cur_loc-41;
@<Check the pointer to the previous |bop|@>;
p←cur_loc-45;
end
@ During this pass, |p| points to the previous |bop| command in the file.
@<Check the pointer to the previous |bop|@>=
begin k←signed_quad;
if k≠p then bad_dvi('backpointer in byte ',cur_loc-4:0,
' should be ',p:0);
end
@* Reading the postamble.
Now imagine that we are reading the \.{DVI} file and positioned just
four bytes after the |pst| command. That, in fact, is the situation,
when the following part of \.{DVItype} is called upon to read, translate,
and check the rest of the postamble.
@<Read, translate, and check the postamble@>=
print_ln('Postamble starts at byte ',pst_loc:0,'.');
@<Compute the conversion factor@>;
max_v←signed_quad; max_h←signed_quad; max_stack_depth←get_two_bytes;
print('maxv=',max_v:0,', maxh=',max_h:0,', maxstackdepth=',max_stack_depth:0);
m←get_two_bytes; print(', totalpages=',m:0);
if m=page_count then print_ln(' ')
else print_ln(' (should be ',page_count:0,'!)');
@<Process the font definitions@>;
@<Make sure that the end of the file is well-formed@>;
@ We put the above module into a subprocedure to make a substantial
reduction in the size of the main procedure, which would be otherwise
marginally too large for some compilers. We do this rather than making a
barely adequate fix because we want to gain the freedom to make minor
changes to the program in the future without having to worry about slight
increases in program size.
@<Declaration of subprocedure to read the |postamble|@>=
procedure read_postamble; {reads, translates, and checks the |postamble|}
var k:integer; {loop variable}
begin
@<Add twenty lines of code@>;
@<Read, translate, and check the postamble@>;
end;
@ The following module may be useful for adding twenty lines of
code to a procedure or to the main program to increase its length, just to make
sure that it was not already marginally too large.
@<Add twenty lines of code@>=
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
if xchr[@'40]=xchr[@'40] then do_nothing;
@ The conversion factor |conv| is figured as follows: There are exactly
|n/m| decimicrons per \.{DVI} unit, and 254000 decimicrons per inch,
and |resolution| pixels per inch. Then we have to adjust this
by the stated amount of magnification.
@<Compute the conversion factor@>=
n←signed_quad; m←signed_quad;
if n≤0 then bad_dvi('numerator is ',n:0);
if m≤0 then bad_dvi('denominator is ',m:0);
print_ln('numerator/denominator=',n:0,'/',m:0);
conv←(n/254000.0)*(resolution/m);
n←signed_quad;
if new_mag>0 then n←new_mag
else if n≤0 then bad_dvi('magnification is ',n:0);
true_conv←conv; conv←true_conv*(n/1000.0);
print_ln('magnification=',n:0,'; ',conv:16:8,' pixels per DVI unit')
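@ A worked instance of the formula for |conv| may be helpful. The following
Python sketch is not part of \.{DVItype}. The numerator and denominator are
assumed sample values (the ones that correspond to units of $2^{-16}$
printer's points, with 72.27 points per inch); the resolution and the
magnification are likewise hypothetical.

```python
def conversion_factors(num, den, resolution, mag):
    """Return the pair (true_conv, conv), both in pixels per DVI unit."""
    true_conv = (num / 254000.0) * (resolution / den)
    conv = true_conv * (mag / 1000.0)
    return true_conv, conv


# Assumed sample values, not taken from any particular DVI file.
num, den = 25400000, 473628672
true_conv, conv = conversion_factors(num, den, 300.0, 1200)
print(true_conv * 65536 * 72.27)  # pixels per inch, close to 300
print(conv / true_conv)           # close to 1.2
```

Under these assumptions one inch of \.{DVI} units comes out as very nearly
300 pixels before magnification, which is a convenient sanity check.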
@ When we get to the present code, the phony `$-1$' font number has
just been read.
@<Make sure that the end of the file is well-formed@>=
q←signed_quad;
if q≠pst_loc then
print_ln('pst pointer in byte ',cur_loc-4:0,
@.pst pointer...should be...@>
' should be ',pst_loc:0,'!');
m←get_byte;
if m≠id_byte then print_ln('identification in byte ',cur_loc-1:0,
@.identification...should be...@>
' should be ',id_byte:0,'!');
k←cur_loc; m←223;
while (m=223)∧ not eof(dvi_file) do m←get_byte;
if not eof(dvi_file) then print_ln('signature in byte ',cur_loc-1:0,
@.signature...should be...@>
' should be 223!')
else if cur_loc<k+4 then
print_ln('not enough signature bytes at end of file (',
@.not enough signature bytes...@>
cur_loc-k:0,')');
@ The most complicated part of the postamble processing is the part
that we still have to tackle, namely the definitions of the fonts.
@<Process the font definitions@>=
font_num[nf]←signed_quad;
while font_num[nf]≠-1 do
begin if eof(dvi_file) then bad_dvi('endless font definitions');
if nf=max_fonts then abort('DVItype capacity exceeded (max fonts=',
@.DVItype capacity exceeded...@>
max_fonts:0,')!');
print('Font ',font_num[nf]:0,': ');
m←signed_quad; {the check sum}
q←signed_quad; {the scaled size}
d←signed_quad; {the design size}
@<Move font name into storage and into |cur_name|@>;
@<Load the font unless there are problems@>;
font_num[nf]←signed_quad;
end
@ We substitute question marks for non-ascii characters in the font name.
@<Move font name into storage and into |cur_name|@>=
p←get_byte; {length of the area/directory spec}
n←get_byte; {length of the font name proper}
if font_name[nf]+n+p>name_size then
abort('DVItype capacity exceeded (name size=',name_size:0,')!');
@.DVItype capacity exceeded...@>
if n+p=0 then bad_dvi('null font name');
font_name[nf+1]←font_name[nf]+n+p;
for k←font_name[nf] to font_name[nf+1]-1 do
begin r←get_byte;
if (r<" ")∨(r>"~") then names[k]←"?"@+else names[k]←r;
end;
incr(nf); print_font(nf-1); decr(nf);
@<Move font name into the |cur_name| string@>
@ If |p=0|, i.e., if no font directory has been specified, \.{DVItype}
is supposed to use the default font directory, which is a
system-dependent place where the standard fonts are kept.
The string variable |default_directory| contains the name of this area.
@↑system dependencies@>
@d default_directory_name=='<SYS.FONTS>' {change this to the correct name}
@d default_directory_name_length=11 {change this to the correct length}
@<Glob...@>=
@!default_directory:packed array[1..default_directory_name_length] of char;
@ @<Set init...@>=
default_directory←default_directory_name;
@ The string |cur_name| is supposed to be set to the external name of the
\.{TFM} file for the current font. This usually means that we need to
prepend the name of the default directory, and
to append the suffix `\.{.TFM}'. Furthermore, we change lower case letters
to upper case, since |cur_name| is a \PASCAL\ string.
@↑system dependencies@>
@<Move font name into the |cur_name| string@>=
for k←1 to name_length do cur_name[k]←' ';
if p=0 then
begin for k←1 to default_directory_name_length do
cur_name[k]←default_directory[k];
r←default_directory_name_length;
end
else r←0;
for k←font_name[nf] to font_name[nf+1]-1 do
begin incr(r);
if r+4>name_length then abort('Font name is too long!');
@.Font name is too long@>
if (names[k]≥"a")∧(names[k]≤"z") then
cur_name[r]←xchr[names[k]-@'40]
else cur_name[r]←xchr[names[k]];
end;
cur_name[r+1]←'.'; cur_name[r+2]←'T'; cur_name[r+3]←'F'; cur_name[r+4]←'M'
@ @<Load the font unless there are problems@>=
k←0;
while font_num[k]≠font_num[nf] do incr(k);
if k<nf then print_ln('---not loaded, this number already used!')
@.this number already used@>
else begin open_tfm_file;
if eof(tfm_file) then
print_ln('---not loaded, TFM file can''t be opened!')
@.TFM file can\'t be opened@>
else begin if (q≤0)∨(q≥@'1000000000) then
print_ln('---not loaded, bad scale (',q:0,')!')
@.bad scale@>
else if (d≤0)∨(d≥@'1000000000) then
print_ln('---not loaded, bad design size (',d:0,')!')
@.bad design size@>
else if in_TFM(q) then @<Finish loading the new font info@>;
end;
end
@ @<Finish loading...@>=
begin font_space[nf]←q div 6; {this is a 3-unit ``thin space''}
if (m≠0)∧(tfm_check_sum≠0)∧(m≠tfm_check_sum) then
begin print_ln('---loaded but beware: check sums do not agree!');
print_ln(' (',m:0,' vs. ',tfm_check_sum:0,')');
end
else print_ln('---loaded at size ',q:0,' DVI units');
d←trunc((100.0*conv*q)/(true_conv*d)+0.5);
if d≠100 then print_ln(' (this font is magnified ',d:0,'%)');
incr(nf); {now the new font is officially present}
end
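@ The percentage reported for a magnified font depends only on the ratio of
|conv| to |true_conv| and on the ratio of the scaled size to the design size.
Here is a sketch of that computation in Python, not part of \.{DVItype}; the
function name and the sample sizes are hypothetical.

```python
import math


def magnification_percent(conv, true_conv, q, d):
    """Rounded percentage of scaled size q relative to design size d."""
    return math.trunc((100.0 * conv * q) / (true_conv * d) + 0.5)


size = 655360  # hypothetical size in DVI units, used for both q and d
print(magnification_percent(1.0, 1.0, size, size))  # 100: nothing is reported
print(magnification_percent(1.2, 1.0, size, size))  # 120
```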
@* The main program.
Now we are ready to put it all together. This is where \.{DVItype} starts,
and where it ends.
@p @<Declaration of subprocedure to read the |postamble|@>@;
begin initialize; {get all variables initialized}
dialog; {set up all the options}
@<Do a quick look...@>;
read_postamble;
if start_loc<0 then print_ln('The starting page could not be found!')
else begin if random_reading then move_to_byte(start_loc)
else begin open_dvi_file; {prepare for second pass}
while cur_loc<start_loc do n←get_byte;
end;
@<Translate up to |max_pages| pages;
|goto final_end| if the postamble is reached@>;
end;
final_end:end.
@ The code shown here uses a convention that has proved to be useful:
If the starting page was specified as, e.g., `\.{1.*.-5}', then
all page numbers in the file are displayed by showing the values of
counts 0, 1, and@@2, separated by dots. Such numbers can, for example,
be displayed on the console of a printer when it is working on that
page.
@<Translate up to...@>=
while max_pages>0 do
begin decr(max_pages);
repeat k←get_byte;
until k≠nop;
if k=pst then goto final_end;
if k≠bop then bad_dvi('command at byte ',cur_loc-1:0,' is not bop');
print_ln(' '); print(cur_loc-1:0,': beginning of page ');
for k←0 to start_vals do
begin print(signed_quad:0);
if k<start_vals then print('.')
else print_ln(' ');
end;
for k←start_vals+1 to 10 do n←signed_quad;
{ignore remaining counts and the back pointer}
if not do_page then abort('page ended unexpectedly!');
@.page ended unexpectedly@>
end
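@ The page heading printed above consists of counts 0 through |start_vals|,
separated by dots. A sketch of the same formatting in Python, not part of
\.{DVItype}, with a hypothetical function name and made-up count values:

```python
def page_label(counts, start_vals):
    """Join counts 0..start_vals with dots, as in the page heading."""
    return ".".join(str(c) for c in counts[: start_vals + 1])


counts = [1, 0, -5, 0, 0, 0, 0, 0, 0, 0]  # hypothetical ten count values
print(page_label(counts, 2))  # 1.0.-5
```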
@* System-dependent changes.
This module should be replaced, if necessary, by whatever changes to the
program are needed to make \.{DVItype} work at a particular installation.
It is usually best to design your change file so that all changes to
previous modules preserve the module numbering; then everybody's version
will be consistent with the printed program. More extensive changes,
which introduce new modules, can be inserted here; then only the index
itself will get a new module number.
@↑system dependencies@>
@* Index.
Pointers to error messages appear here together with the section numbers
where each ident\-i\-fier is used.